TITLE OF THE INVENTION
HEPATITIS C VIRUS VACCINE

RELATED APPLICATIONS
The present application claims priority to provisional applications U.S.
Serial No. 60/363[...], filed March 13, 2002, and U.S. Serial No. 60/[...], filed
October 11, 2001, each of which is hereby incorporated by reference herein.

BACKGROUND OF THE INVENTION 
The references cited in the present application are not admitted to be
prior art to the claimed invention.

About 3% of the world's population is infected with the Hepatitis C
virus (HCV). (Wasley et al., Semin. Liver Dis. 20, 1-16, 2000.) Exposure to HCV
results in an overt acute disease in a small percentage of cases, while in most
instances the virus establishes a chronic infection causing liver inflammation that
slowly progresses into liver failure and cirrhosis. (Iwarson, FEMS Microbiol. Rev. 14,
201-204, 1994.) In addition, epidemiological surveys indicate an important role of
HCV in the pathogenesis of hepatocellular carcinoma. (Kew, FEMS Microbiol. Rev.
14, 211-220, 1994, Alter, Blood 85, 1681-1695, 1995.)

Prior to the implementation of routine blood screening for HCV in
1992, most infections were contracted by inadvertent exposure to contaminated blood,
blood products or transplanted organs. In those areas where blood screening of HCV
is carried out, HCV is primarily contracted through direct percutaneous exposure to
infected blood, i.e., intravenous drug use. Less frequent methods of transmission
include perinatal exposure, hemodialysis, and sexual contact with an HCV infected
person. (Alter et al., N. Engl. J. Med. 341(8), 556-562, 1999, Alter, J. Hepatol. 31
Suppl., 88-91, 1999, Semin. Liver Dis. 20(1), 1-16, 2000.)

The HCV genome consists of a single strand RNA of about 9.5 kb
encoding a precursor polyprotein of about 3000 amino acids. (Choo et al., Science
244, 362-364, 1989, Choo et al., Science 244, 359-362, 1989, Takamizawa et al., J.
Virol. 65, 1105-1113, 1991.) The HCV polyprotein contains the viral proteins in the
order: C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B.

Individual viral proteins are produced by proteolysis of the HCV
polyprotein. Host cell proteases release the putative structural proteins C, E1, E2, and
p7, and create the N-terminus of NS2 at amino acid 810. (Mizushima et al., J. Virol.
68, 2731-2734, 1994, Hijikata et al., P.N.A.S. USA 90, 10773-10777, 1993.)

The non-structural proteins NS3, NS4A, NS4B, NS5A and NS5B
presumably form the virus replication machinery and are released from the
polyprotein. A zinc-dependent protease associated with NS2 and the N-terminus of
NS3 is responsible for cleavage between NS2 and NS3. (Grakoui et al., J. Virol. 67,
1385-1395, 1993, Hijikata et al., P.N.A.S. USA 90, 10773-10777, 1993.) A distinct
serine protease located in the N-terminal domain of NS3 is responsible for proteolytic
cleavages at the NS3/NS4A, NS4A/NS4B, NS4B/NS5A and NS5A/NS5B junctions.
(Bartenschlager et al., J. Virol. 67, 3835-3844, 1993, Grakoui et al., Proc. Natl. Acad.
Sci. USA 90, 10583-10587, 1993, Tomei et al., J. Virol. 67, 4017-4026, 1993.)
NS4A provides a cofactor for NS3 activity. (Failla et al., J. Virol. 68, 3753-3760,
1994, De Francesco et al., U.S. Patent No. 5,739,002.)

NS5A is a highly phosphorylated protein conferring interferon
resistance. (De Francesco et al., Semin. Liver Dis., 20(1), 69-83, 2000, Pawlotsky,
J. Viral Hepat. Suppl. 1, 47-48, 1999.)

NS5B provides an RNA-dependent RNA polymerase. (De Francesco
et al., International Publication Number WO 96/37619, Behrens et al., EMBO 15, 12-22,
1996, Lohmann et al., Virology 249, 108-118, 1998.)


SUMMARY OF THE INVENTION 

The present invention features Ad6 vectors and a nucleic acid encoding
a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide containing an inactive NS5B
RNA-dependent RNA polymerase region. The nucleic acid is particularly useful as a
component of an adenovector or DNA plasmid vaccine providing a broad range of
antigens for generating an HCV specific cell mediated immune (CMI) response
against HCV.

A HCV specific CMI response refers to the production of cytotoxic T
lymphocytes and T helper cells that recognize an HCV antigen. The CMI response
may also include non-HCV specific immune effects.

Preferred nucleic acids encode a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide that is substantially similar to SEQ. ID. NO. 1 and has sufficient protease
activity to process itself to produce at least a polypeptide substantially similar to the
NS5B region present in SEQ. ID. NO. 1. The produced polypeptide corresponding to
NS5B is enzymatically inactive. More preferably, the HCV polypeptide has sufficient
protease activity to produce polypeptides substantially similar to the NS3, NS4A,
NS4B, NS5A, and NS5B regions present in SEQ. ID. NO. 1.

Reference to a "substantially similar sequence" indicates an identity of at
least about 65% to a reference sequence. Thus, for example, polypeptides having an amino
acid sequence substantially similar to SEQ. ID. NO. 1 have an overall amino acid identity
of at least about 65% to SEQ. ID. NO. 1.

Polypeptides corresponding to NS3, NS4A, NS4B, NS5A, and NS5B
have an amino acid sequence identity of at least about 65% to the corresponding
region in SEQ. ID. NO. 1. Such corresponding polypeptides are also referred to
herein as NS3, NS4A, NS4B, NS5A, and NS5B polypeptides.

Thus, a first aspect of the present invention describes a nucleic acid
comprising a nucleotide sequence encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide substantially similar to SEQ. ID. NO. 1. The encoded polypeptide has
sufficient protease activity to process itself to produce an NS5B polypeptide that is
enzymatically inactive.

In a preferred embodiment, the nucleic acid is an expression vector
capable of expressing the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide in a
desired human cell. Expression inside a human cell has therapeutic applications for
actively treating an HCV infection and for prophylactically treating against an HCV
infection.

An expression vector contains a nucleotide sequence encoding a
polypeptide along with regulatory elements for proper transcription and processing.
The regulatory elements that may be present include those naturally associated with
the nucleotide sequence encoding the polypeptide and exogenous regulatory elements
not naturally associated with the nucleotide sequence. Exogenous regulatory elements
such as an exogenous promoter can be useful for expression in a particular host, such
as in a human cell. Examples of regulatory elements useful for functional expression
include a promoter, a terminator, a ribosome binding site, and a polyadenylation
signal.

Another aspect of the present invention describes a nucleic acid
comprising a gene expression cassette able to express in a human cell a Met-NS3-
NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1. The
polypeptide can process itself to produce an enzymatically inactive NS5B protein. The
gene expression cassette contains at least the following:
a) a promoter transcriptionally coupled to a nucleotide sequence
encoding a polypeptide;
b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence;
c) a terminator joined to the 3' end of the nucleotide sequence; and
d) a 3' polyadenylation signal functionally coupled to the nucleotide
sequence.

Reference to "transcriptionally coupled" indicates that the promoter is
positioned such that transcription of the nucleotide sequence can be brought about by
RNA polymerase binding at the promoter. Transcriptionally coupled does not require
that the sequence being transcribed is adjacent to the promoter.

Reference to "functionally coupled" indicates the ability to mediate an
effect on the nucleotide sequence. Functionally coupled does not require that the
coupled sequences be adjacent to each other. A 3' polyadenylation signal functionally
coupled to the nucleotide sequence facilitates cleavage and polyadenylation of the
transcribed RNA. A 5' ribosome binding site functionally coupled to the nucleotide
sequence facilitates ribosome binding.

In preferred embodiments the nucleic acid is a DNA plasmid vector or
an adenovector suitable for either therapeutic application in treating HCV or as an
intermediate in the production of a therapeutic vector. Treating HCV includes
actively treating an HCV infection and prophylactically treating against an HCV
infection.

Another aspect of the present invention describes an adenovector
comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette able to express
a polypeptide substantially similar to SEQ. ID. NO. 1 that is produced by a process
involving (a) homologous recombination and (b) adenovector rescue. The
homologous recombination step produces an adenovirus genome plasmid. The
adenovector rescue step produces the adenovector from the adenovirus genome plasmid.

Adenovirus genome plasmids described herein contain a recombinant
adenovirus genome having a deletion in the E1 region and optionally in the E3 region
and a gene expression cassette inserted into one of the deleted regions. The
recombinant adenovirus genome is made of regions substantially similar to one or
more adenovirus serotypes.

Another aspect of the present invention describes an adenovector
consisting of the nucleic acid sequence of SEQ. ID. NO. 4 or a derivative thereof,
wherein said derivative thereof has the HCV polyprotein encoding sequence present
in SEQ. ID. NO. 4 replaced with the HCV polyprotein encoding sequence of either
SEQ. ID. NO. 3, SEQ. ID. NO. 10 or SEQ. ID. NO. 11.

Another aspect of the present invention describes a cultured
recombinant cell comprising a nucleic acid containing a sequence encoding a Met-
NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1.
The recombinant cell has a variety of uses such as being used to replicate nucleic acid
encoding the polypeptide in vector construction methods.

Another aspect of the present invention describes a method of making
an adenovector comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette
able to express a polypeptide substantially similar to SEQ. ID. NO. 1. The method
involves the steps of (a) producing an adenovirus genome plasmid containing a
recombinant adenovirus genome with deletions in the E1 and E3 regions and a gene
expression cassette inserted into one of the deleted regions and (b) rescuing the
adenovector from the adenovirus genome plasmid.

Another aspect of the present invention describes a pharmaceutical
composition comprising a vector for expressing a Met-NS3-NS4A-NS4B-NS5A-
NS5B polypeptide substantially similar to SEQ. ID. NO. 1 and a pharmaceutically
acceptable carrier. The vector is suitable for administration and polypeptide
expression in a patient.

A "patient" refers to a mammal capable of being infected with HCV.
A patient may or may not be infected with HCV. Examples of patients are humans
and chimpanzees.

Another aspect of the present invention describes a method of treating
a patient comprising the step of administering to the patient an effective amount of a
vector expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially
similar to SEQ. ID. NO. 1. The vector is suitable for administration and polypeptide
expression in the patient.

The patient undergoing treatment may or may not be infected with
HCV. For a patient infected with HCV, an effective amount is sufficient to achieve
one or more of the following effects: reduce the ability of HCV to replicate, reduce
HCV load, increase viral clearance, and increase one or more HCV specific CMI
responses. For a patient not infected with HCV, an effective amount is sufficient to
achieve one or more of the following: an increased ability to produce one or more
components of a HCV specific CMI response to a HCV infection, a reduced
susceptibility to HCV infection, and a reduced ability of the infecting virus to
establish persistent infection for chronic disease.

Another aspect of the present invention features a recombinant nucleic
acid comprising an Ad6 region and a region not present in Ad6. Reference to
"recombinant" nucleic acid indicates the presence of two or more nucleic acid regions
not naturally associated with each other. Preferably, the Ad6 recombinant nucleic
acid contains Ad6 regions and a gene expression cassette coding for a polypeptide
heterologous to Ad6.

Other features and advantages of the present invention are apparent
from the additional descriptions provided herein including the different examples.
The provided examples illustrate different components and methodology useful in
practicing the present invention. The examples do not limit the claimed invention.
Based on the present disclosure the skilled artisan can identify and employ other
components and methodology useful for practicing the present invention.


BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B illustrate SEQ. ID. NO. 1.

Figures 2A, 2B, 2C, and 2D illustrate SEQ. ID. NO. 2. SEQ. ID. NO.
2 provides a nucleotide sequence coding for SEQ. ID. NO. 1 along with an optimized
internal ribosome entry site and TAAA termination. Nucleotides 1-6 provide an
optimized internal ribosome entry site. Nucleotides 7-5961 code for a HCV Met-
NS3-NS4A-NS4B-NS5A-NS5B polypeptide with nucleotides in positions 5137 to
5145 providing an AlaAlaGly sequence in amino acid positions 1711 to 1713 that
renders NS5B inactive. Nucleotides 5962-5965 provide a TAAA termination.

Figures 3A, 3B, 3C, and 3D illustrate SEQ. ID. NO. 3. SEQ. ID. NO.
3 is a codon optimized version of SEQ. ID. NO. 2. Nucleotides 7-5961 encode a
HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide.

Figures 4A-4M illustrate MRKAd6-NSmut (SEQ. ID. NO. 4). SEQ.
ID. NO. 4 is an adenovector containing an expression cassette where the polypeptide
of SEQ. ID. NO. 1 is encoded by SEQ. ID. NO. 2. Base pairs 1-450 correspond to
Ad5 base pairs 1 to 450; base pairs 462 to 1252 correspond to the human CMV promoter;
base pairs 1258 to 1267 correspond to the Kozak sequence; base pairs 1264 to 7222
correspond to the NS genes; base pairs 7231 to 7451 correspond to the BGH
polyadenylation signal; base pairs 7469 to 9506 correspond to Ad5 base pairs 3511 to
5548; base pairs 9507 to 32121 correspond to Ad6 base pairs 5542 to 28156; base
pairs 32122 to 35117 correspond to Ad6 base pairs 30789 to 33784; and base pairs
35118 to 37089 correspond to Ad5 base pairs 33967 to 35935.

Figures 5A-5O illustrate SEQ. ID. NOs. 5 and 6. SEQ. ID. NO. 5
encodes a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide with an active
RNA dependent RNA polymerase. SEQ. ID. NO. 6 provides the amino acid sequence
for the polypeptide.

Figures 6A-6C provide the nucleic acid sequence for pV1JnsA (SEQ.
ID. NO. 7).

Figures 7A-7N provide the nucleic acid sequence for the Ad6 genome
(SEQ. ID. NO. 8).

Figures 8A-8K provide the nucleic acid sequence for the Ad5 genome
(SEQ. ID. NO. 9).

Figure 9 illustrates different regions of the Ad6 genome. The linear
(35759 bp) ds DNA genome is indicated by two parallel lines and is divided into 100
map units. Transcription units are shown relative to their position and orientation in
the genome. Early genes (E1A, E1B, E2A/B, E3 and E4) are indicated by gray arrows.
Late genes (L1 to L5), indicated by black arrows, are produced by alternative splicing
of a transcript produced from the major late promoter (MLP) and all contain the
tripartite leader (1, 2, 3) at their 5' ends. The E1 region is located from approximately
1.0 to 11.5 map units, the E2 region from 75.0 to 11.5 map units, E3 from 76.1 to 86.7
map units, and E4 from 99.5 to 91.2 map units. The major late transcription unit is
located between 16.0 and 91.2 map units.
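
By way of illustration only, the map unit coordinates given for Figure 9 can be
converted to approximate base pair positions by scaling to the stated genome
length of 35759 bp. The following is a minimal sketch; the function name
map_units_to_bp and the rounding to the nearest base pair are assumptions made
for the example and are not part of the disclosure.

```python
GENOME_LENGTH_BP = 35759  # Ad6 genome length stated for Figure 9
TOTAL_MAP_UNITS = 100     # the genome is divided into 100 map units


def map_units_to_bp(map_units: float, genome_length_bp: int = GENOME_LENGTH_BP) -> int:
    """Convert a map-unit coordinate to an approximate base-pair position.

    Hypothetical helper: rounds to the nearest base pair.
    """
    return round(map_units / TOTAL_MAP_UNITS * genome_length_bp)


if __name__ == "__main__":
    # Approximate boundaries of two regions named in the Figure 9 description.
    regions = {"E1": (1.0, 11.5), "E3": (76.1, 86.7)}
    for name, (start, end) in regions.items():
        print(name, map_units_to_bp(start), map_units_to_bp(end))
```

Because the map unit values are themselves approximate, the resulting base pair
positions should be read as estimates rather than exact region boundaries.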

Figure 10 illustrates homologous recombination to recover pAdE1-E3+
containing Ad6 and Ad5 regions.

Figure 11 illustrates homologous recombination to recover a pAdE1-E3+
containing Ad6 regions.

Figure 12 illustrates a western blot on whole-cell extracts from 293
cells transfected with plasmid DNA expressing different HCV NS cassettes. Mature
NS3 and NS5A products were detected with specific antibodies. "pV1Jns-NS" refers
to a pV1JnsA plasmid where a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide is
encoded by SEQ. ID. NO. 5, and SEQ. ID. NO. 5 is inserted between bases 1881 and
1912 of SEQ. ID. NO. 7. "pV1Jns-NSmut" refers to a pV1JnsA plasmid where SEQ.
ID. NO. 2 is inserted between bases 1882 and 1925 of SEQ. ID. NO. 7. "pV1Jns-
NSOPTmut" refers to a pV1JnsA plasmid where SEQ. ID. NO. 3 is inserted between
bases 1881 and 1905 of SEQ. ID. NO. 7.

Figures 13A and 13B illustrate T cell responses by IFNγ ELIspot
induced in C57black6 mice (A) and BalbC mice (B) by two injections of 25 μg and
50 μg, respectively, of plasmid DNA encoding the different HCV NS cassettes with
Gene Electro-Transfer (GET).

Figure 14 illustrates protein expression from different adenovectors
upon infection of HeLa cells. MRKAd5-NSmut is an adenovector based on an Ad5
sequence (SEQ. ID. NO. 9), where the Ad5 genome has an E1 deletion of base pairs
451 to 3510, an E3 deletion of base pairs 28134 to 30817, and has the NS3-NS4A-
NS4B-NS5A-NS5B expression cassette as provided in base pairs 451 to 7468 of SEQ.
ID. NO. 4 inserted between positions 450 and 3511. Ad5-NS is an adenovector based
on an Ad5 backbone with an E1 deletion of base pairs 342 to 3523 and an E3 deletion of
base pairs 28134 to 30817, and containing an expression cassette encoding a NS3-
NS4A-NS4B-NS5A-NS5B from SEQ. ID. NO. 5. "MRKAd6-NSOPTmut" refers to
an adenovector having a modified SEQ. ID. NO. 4 sequence, wherein base pairs 1258
to 7222 of SEQ. ID. NO. 4 are replaced with SEQ. ID. NO. 3.

Figure 15 illustrates T cell responses by IFNγ ELIspot induced in
C57black6 mice by two injections of 10^[...] vp of adenovectors containing different
HCV non-structural gene cassettes.

Figures 16A-16D illustrate T cell responses by IFNγ ELIspot induced
in Rhesus monkeys by one or two injections of 10^[...] vp (A) or 10^[...] vp (B) of
adenovectors containing different HCV non-structural gene cassettes.

Figures 17A and 17B illustrate CD8+ T cell responses by IFNγ ICS
induced in Rhesus monkeys by two injections of 10^[...] vp (A) or 10^[...] vp (B) of
adenovectors encoding the different HCV non-structural gene cassettes.

Figures 18A-18F illustrate T cell responses by bulk CTL assay induced
in Rhesus monkeys by two injections of 10^[...] vp of Ad5-NS (A), MRKAd5-NSmut
(B), or MRKAd6-NSmut (C).

Figure 19 illustrates the plasmid pE2. 

Figures 20A-D illustrate the partial codon optimized sequence
NSsuboptmut (SEQ. ID. NO. 10). Coding sequence for the Met-NS3-NS4A-NS4B-
NS5A-NS5B polypeptide is from base 7 to 5961.

DETAILED DESCRIPTION OF THE INVENTION

The present invention features Ad6 vectors and nucleic acid encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide that contains an inactive NS5B
region. Providing an inactive NS5B region supplies NS5B antigens while reducing the
possibility of adverse side effects due to an active viral RNA polymerase. Uses of the
featured nucleic acid include use as a vaccine component to introduce into a cell an
HCV polypeptide that provides a broad range of antigens for generating a CMI
response against HCV, and as an intermediate for producing such a vaccine
component.

The adaptive cellular immune response can function to recognize viral
antigens in HCV infected cells throughout the body due to the ubiquitous distribution
of major histocompatibility complex (MHC) class I and II expression, to induce
immunological memory, and to maintain immunological memory. These functions
are attributed to antigen-specific CD4+ T helper (Th) and CD8+ cytotoxic T cells
(CTL).

Upon activation via their specific T cell receptors, HCV specific Th
cells fulfill a variety of immunoregulatory functions, most of them mediated by Th1
and Th2 cytokines. HCV specific Th cells assist in the activation and differentiation
of B cells and induction and stimulation of virus-specific cytotoxic T cells. Together
with CTL, Th cells may also secrete IFN-γ and TNF-α that inhibit replication and
gene expression of several viruses. Additionally, Th cells and CTL, the main effector
cells, can induce apoptosis and lysis of virus infected cells.

HCV specific CTL are generated from antigens processed by
professional antigen presenting cells (pAPCs). Antigens can be either synthesized
within or introduced into pAPCs. Antigen synthesis in a pAPC can be brought about
by introducing into the cell an expression cassette encoding the antigen.

A preferred route of nucleic acid vaccine administration is an
intramuscular route. Intramuscular administration appears to result in the introduction
and expression of nucleic acid into somatic cells and pAPCs. HCV antigens produced
in the somatic cells can be transferred to pAPCs for presentation in the context of
MHC class I molecules. (Donnelly et al., Annu. Rev. Immunol. 15:617-648, 1997.)

pAPCs process longer length antigens into smaller peptide antigens in
the proteasome complex. The antigen is translocated into the endoplasmic
reticulum/Golgi complex secretory pathway for association with MHC class I
proteins. CD8+ T lymphocytes recognize antigen associated with class I MHC via the
T cell receptor (TCR) and the CD8 cell surface protein.

Using a nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide as a vaccine component allows for production of a broad range of
antigens capable of generating CMI responses from a single vector. The polypeptide
should be able to process itself sufficiently to produce at least a region corresponding
to NS5B. Preferred nucleic acids encode an amino acid sequence substantially similar
to SEQ. ID. NO. 1 that has sufficient protease activity to process itself to produce
individual HCV polypeptides substantially similar to the NS3, NS4A, NS4B, NS5A,
and NS5B regions present in SEQ. ID. NO. 1.

A polypeptide substantially similar to SEQ. ID. NO. 1 with sufficient
protease activity to process itself in a cell provides the cell with T cell epitopes that
are present in several different HCV strains. Protease activity is provided by NS3 and
NS3/NS4A proteins digesting the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide at
the appropriate cleavage sites to release polypeptides corresponding to NS3, NS4A,
NS4B, NS5A, and NS5B. Self-processing of the Met-NS3-NS4A-NS4B-NS5A-
NS5B generates polypeptides that approximate naturally occurring HCV polypeptides.

Based on the guidance provided herein a sufficiently strong immune
response can be generated to achieve beneficial effects in a patient. The provided
guidance includes information concerning HCV sequence selection, vector selection,
vector production, combination treatment, and administration.

I. HCV SEQUENCES
A variety of different nucleic acid sequences can be used as a vaccine
component to supply a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide to a
cell or as an intermediate to produce vaccine components. The starting point for
obtaining suitable nucleic acid sequences is preferably a naturally occurring NS3-
NS4A-NS4B-NS5A-NS5B polypeptide sequence modified to produce an inactive
NS5B.

The use of a HCV nucleic acid sequence providing HCV non-structural
antigens to generate a CMI response is mentioned by Cho et al., Vaccine 17:1136-
1144, 1999, Paliard et al., International Publication Number WO 01/30812 (not
admitted to be prior art to the claimed invention), and Coit et al., International
Publication Number WO 01/38360 (not admitted to be prior art to the claimed
invention). Such references fail to describe, for example, a polypeptide that processes
itself to produce an inactive NS5B, and the particular combinations of HCV
sequences and delivery vehicles employed herein.

Modifications to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide sequence can be produced by altering the encoding nucleic acid.
Alterations can be performed to create deletions, insertions and substitutions.

Small modifications can be made in NS5B to produce an inactive
polymerase by targeting motifs essential for replication. Examples of motifs critical
for NS5B activity and modifications that can be made to produce an inactive NS5B
are described by Lohmann et al., Journal of Virology 71:8416-8426, 1997, and
Kolykhalov et al., Journal of Virology 74:2046-20[...].

Additional factors to take into account when producing modifications
to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide include maintaining the
ability to self-process and maintaining T cell antigens. The ability of the HCV
polypeptide to process itself is determined to a large extent by a functional NS3
protease. Modifications that maintain NS3 protease activity can be obtained
by taking into account the NS3 protein, NS4A which serves as a cofactor for NS3, and
NS3 protease recognition sites present within the NS3-NS4A-NS4B-NS5A-NS5B
polypeptide.

Different modifications can be made to naturally occurring NS3-
NS4A-NS4B-NS5A-NS5B polypeptide sequences to produce polypeptides able to
elicit a broad range of T cell responses. Factors influencing the ability of a
polypeptide to elicit a broad T cell response include the preservation or introduction
of HCV specific T cell antigen regions and prevalence of different T cell antigen
regions in different HCV isolates.

Numerous examples of naturally occurring HCV isolates are well
known in the art. HCV isolates can be classified into the following six major
genotypes comprising one or more subtypes: HCV-1/(1a,1b,1c), HCV-2/(2a,2b,2c),
HCV-3/(3a,3b,10a), HCV-4/(4a), HCV-5/(5a) and HCV-6/(6a,6b,7b,8b,9a,11a).
(Simmonds, J. Gen. Virol., 693-712, 2001.) Examples of particular HCV sequences
such as HCV-BK, HCV-J, HCV-N, HCV-H, have been deposited in GenBank and
described in various publications. (See, for example, Chamberlain et al., J. Gen.
Virol., 1341-1347, 1997.)

HCV T cell antigens can be identified by, for example, empirical
experimentation. One way of identifying T cell antigens involves generating a series
of overlapping short peptides from a longer length polypeptide and then screening the
T-cell populations from infected patients for positive clones. Positive clones are
activated/primed by a particular peptide. Techniques such as IFNγ-ELISPOT, IFNγ-
Intracellular staining and bulk CTL assays can be used to measure peptide activity.
Peptides thus identified can be considered to represent T-cell epitopes of the
respective pathogen.
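
As a non-limiting illustration of generating a series of overlapping short
peptides from a longer polypeptide, the tiling can be sketched as follows. The
function name overlapping_peptides, the default peptide length of 15 residues
and the default offset of 5 residues are assumptions chosen for the example;
the present description does not specify them.

```python
def overlapping_peptides(sequence: str, length: int = 15, step: int = 5) -> list[str]:
    """Tile a polypeptide into peptides of `length` residues, one every `step` residues.

    Hypothetical helper; consecutive peptides overlap by (length - step)
    residues whenever step < length.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [sequence]
    last_start = len(sequence) - length
    peptides = [sequence[i:i + length] for i in range(0, last_start + 1, step)]
    # Add one final peptide when the regular tiling stops short of the
    # C-terminus, so that every residue is covered by at least one peptide.
    if last_start % step != 0:
        peptides.append(sequence[-length:])
    return peptides


if __name__ == "__main__":
    # Toy sequence for demonstration only, not an HCV sequence.
    for peptide in overlapping_peptides("ABCDEFGHIJK", length=4, step=3):
        print(peptide)
```

Each peptide in the resulting list could then be tested individually in an
assay of the kind named above.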

HCV T cell antigen regions from different HCV isolates can be
introduced into a single sequence by, for example, producing a hybrid NS3-NS4A-
NS4B-NS5A-NS5B polypeptide containing regions from two or more naturally
occurring sequences. Such a hybrid can contain additional modifications, which
preferably do not reduce the ability of the polypeptide to produce an HCV CMI
response.

The ability of a modified Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide to process itself and produce a CMI response can be determined using
techniques described herein or well known in the art. Such techniques include the use
of IFNγ-ELISPOT, IFNγ-Intracellular staining and bulk CTL assays to measure a
HCV specific CMI response.

A. Met-NS3-NS4A-NS4B-NS5A-NS5B Sequences 

SEQ. ID. NO. 1 provides a preferred Met-NS3-NS4A-NS4B-NS5A-
NS5B sequence. SEQ. ID. NO. 1 contains a large number of HCV specific T cell
antigens that are present in several different HCV isolates. SEQ. ID. NO. 1 is similar
to the NS3-NS4A-NS4B-NS5A-NS5B portion of the HCV BK strain nucleotide
sequence (GenBank accession number M58335).

In SEQ. ID. NO. 1 anchor positions important for recognition by MHC
class I molecules are conserved or represent conservative substitutions for 18 out of
20 known T-cell epitopes in the NS3-NS4A-NS4B-NS5A-NS5B portion of HCV
polyproteins. With respect to the remaining two known T-cell epitopes, one has a
non-conservative anchor substitution in SEQ. ID. NO. 1 that may still be recognized
by a different HLA supertype and one epitope has one anchor residue not conserved.
HCV T-cell epitopes are described in Chisari et al., Curr. Top. Microbiol. Immunol.,
242:299-325, 2000, and Lechner et al., J. Exp. Med. 9:1499-1512, 2000.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B
nucleotide sequence and SEQ. ID. NO. 1 include the introduction of a methionine at
the 5' end and the presence of modified NS5B active site residues in SEQ. ID. NO. 1.
The modification replaces GlyAspAsp with AlaAlaGly (residues 1711-1713) to
inactivate NS5B.
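
The residue numbering used for this modification can be checked against the
nucleotide numbering given for SEQ. ID. NO. 2, where the coding sequence starts
at nucleotide 7 and nucleotides 5137 to 5145 provide the AlaAlaGly sequence.
The following is a minimal sketch of that arithmetic together with a toy motif
replacement. The function names are hypothetical, and the toy protein is a
placeholder rather than any sequence of the present disclosure.

```python
CODING_START_NT = 7  # first coding nucleotide of SEQ. ID. NO. 2, as stated for Figure 2


def residue_to_nucleotides(residue: int, coding_start: int = CODING_START_NT) -> tuple[int, int]:
    """Return 1-based (first, last) nucleotide positions of the codon for a 1-based residue."""
    first = coding_start + 3 * (residue - 1)
    return first, first + 2


def replace_motif(protein: str, start: int, old: str, new: str) -> str:
    """Replace `old` with `new` at 1-based residue `start`, verifying `old` is present there."""
    i = start - 1
    if protein[i:i + len(old)] != old:
        raise ValueError(f"expected {old!r} at residue {start}")
    return protein[:i] + new + protein[i + len(old):]


if __name__ == "__main__":
    # Residues 1711-1713 map to nucleotides 5137-5145.
    print(residue_to_nucleotides(1711), residue_to_nucleotides(1713))
    # One-letter codes: G = Gly, D = Asp, A = Ala. Toy sequence only.
    print(replace_motif("MKGDDL", 3, "GDD", "AAG"))
```

The same arithmetic places the last residue, 1985, at nucleotides 5959 to 5961,
consistent with the stated end of the coding region.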

The encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide
preferably has an amino acid sequence substantially similar to SEQ. ID. NO. 1. In
different embodiments, the encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide has an amino acid identity to SEQ. ID. NO. 1 of at least 65%, at least
75%, at least 85%, at least 95%, at least 99% or 100%; or differs from SEQ. ID. NO.
1 by 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16,
1-17, 1-18, 1-19, or 1-20 amino acids.

Amino acid differences between a Met-NS3-NS4A-NS4B-NS5A-
NS5B polypeptide and SEQ. ID. NO. 1 are calculated by determining the minimum
number of amino acid modifications in which the two sequences differ. Amino acid
modifications can be deletions, additions, substitutions or any combination thereof.
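
One way to read the minimum number of deletions, additions and substitutions
separating two sequences is as the Levenshtein edit distance. That reading is
an interpretation offered for illustration only, and the function name
amino_acid_differences is hypothetical. A minimal sketch:

```python
def amino_acid_differences(a: str, b: str) -> int:
    """Minimum number of single-residue substitutions, additions and deletions turning a into b.

    Standard dynamic-programming edit distance, keeping one row at a time.
    """
    previous = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, residue_a in enumerate(a, start=1):
        current = [i]  # deleting the first i residues of a
        for j, residue_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (residue_a != residue_b)
            deletion = previous[j] + 1
            addition = current[j - 1] + 1
            current.append(min(substitution, deletion, addition))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Toy sequences for demonstration only.
    print(amino_acid_differences("MAPITAY", "MAPLTAY"))  # one substitution
    print(amino_acid_differences("MAPITAY", "MAPTAY"))   # one deletion
```

For sequences of about 2000 residues this quadratic computation remains
practical, although dedicated alignment software would normally be used.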

Amino acid sequence identity is determined by methods well known in
the art that compare the amino acid sequence of one polypeptide to the amino acid
sequence of a second polypeptide and generate a sequence alignment. Amino acid
identity is calculated from the alignment by counting the number of aligned residue
pairs that have identical amino acids.
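
The counting step can be sketched as follows for two sequences that have
already been aligned, with gaps written as "-". The function names are
hypothetical, and dividing by the total number of alignment columns is an
assumption made for the example, since several denominators are in common use
and the description does not fix one. The 65% threshold is the value given
above for a substantially similar sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the denominator is the alignment length (an illustrative choice).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * identical / len(aligned_a)


def is_substantially_similar(identity_percent: float, threshold: float = 65.0) -> bool:
    """Apply the 'at least about 65%' identity criterion to a computed identity."""
    return identity_percent >= threshold


if __name__ == "__main__":
    # Toy alignment for demonstration only: 7 identical columns out of 9.
    identity = percent_identity("MAPITAYAQ", "MAP-TAYSQ")
    print(round(identity, 2), is_substantially_similar(identity))
```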

Methods for determining sequence identity include those described by
Schuler, G.D. in Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins, Baxevanis, A.D. and Ouellette, B.F.F., eds., John Wiley & Sons, Inc, 2001;
Yona, et al., in Bioinformatics: Sequence, structure and databanks, Higgins, D. and
Taylor, W. eds., Oxford University Press, 2000; and Bioinformatics: Sequence and
Genome Analysis, Mount, D.W., ed., Cold Spring Harbor Laboratory Press, 2001.
Methods to determine amino acid sequence identity are codified in publicly available
computer programs such as GAP (Wisconsin Package Version 10.2, Genetics
Computer Group (GCG), Madison, Wisc.), BLAST (Altschul et al., J. Mol. Biol.
215(3):403-10, 1990), and FASTA (Pearson, Methods in Enzymology 183:63-98,
1990, R.F. Doolittle, ed.).

In an embodiment of the present invention sequence identity between
two polypeptides is determined using the GAP program (Wisconsin Package Version
10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the alignment
method of Needleman and Wunsch. (Needleman, et al., J. Mol. Biol. 48:443-453,
1970.) GAP considers all possible alignments and gap positions between two
sequences and creates a global alignment that maximizes the number of matched
residues and minimizes the number and size of gaps. A scoring matrix is used to
assign values for symbol matches. In addition, a gap creation penalty and a gap
extension penalty are required to limit the insertion of gaps into the alignment.
Default program parameters for polypeptide comparisons using GAP are the
BLOSUM62 (Henikoff et al., Proc. Natl. Acad. Sci. USA, 89:10915-10919, 1992)
amino acid scoring matrix (MATrix=blosum62.cmp), a gap creation parameter
(GAPweight=8) and a gap extension parameter (LENgthweight=2).
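The interplay of a gap creation penalty and a gap extension penalty in a global alignment can be sketched as follows. This is not the GAP program. It is a simplified illustration that scores residue pairs with a plain match/mismatch value instead of the BLOSUM62 matrix, and it assumes that a gap of length n costs the creation penalty plus n times the extension penalty. The function and parameter names are hypothetical.

```python
def global_alignment_score(a, b, match=1, mismatch=-1,
                           gap_creation=8, gap_extension=2):
    """Best global alignment score with affine gap penalties.
    A gap of length n is assumed to cost gap_creation + n * gap_extension."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # M: a[i-1] paired with b[j-1]
    # X: a[i-1] placed opposite a gap; Y: b[j-1] placed opposite a gap
    M = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    X = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    Y = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    first_gap = gap_creation + gap_extension  # cost of the first gapped position
    for i in range(1, n + 1):
        X[i][0] = -(gap_creation + gap_extension * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_creation + gap_extension * j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] - first_gap,
                          Y[i - 1][j] - first_gap,
                          X[i - 1][j] - gap_extension)
            Y[i][j] = max(M[i][j - 1] - first_gap,
                          X[i][j - 1] - first_gap,
                          Y[i][j - 1] - gap_extension)
    return max(M[n][m], X[n][m], Y[n][m])
```

Because extending an existing gap is cheaper than creating a new one, the sketch favors a single longer gap over several short ones, which is the behavior the two penalties are intended to produce.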

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptides in addition to being substantially similar to SEQ. ID. NO. 1 across their
entire length produce individual NS3, NS4A, NS4B, NS5A and NS5B regions that are
substantially similar to the corresponding regions present in SEQ. ID. NO. 1. The
corresponding regions in SEQ. ID. NO. 1 are provided as follows: Met-NS3 amino
acids 1-632; NS4A amino acids 633-686; NS4B amino acids 687-947; NS5A amino
acids 948-1394; and NS5B amino acids 1395-1985.

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B
region has an amino acid identity to the corresponding region in SEQ. ID. NO. 1 of at
least 65%, at least 75%, at least 85%, at least 95%, at least 99%, or 100%; or an
amino acid difference of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13,
1-14, 1-15, 1-16, 1-17, 1-18, 1-19, or 1-20 amino acids.

Amino acid modifications to SEQ. ID. NO. 1 preferably maintain all or
most of the T-cell antigen regions. Differences in naturally occurring amino acids are
due to different amino acid side chains (R groups). An R group affects different
properties of the amino acid such as physical size, charge, and hydrophobicity.
Amino acids can be divided into different groups as follows: neutral and hydrophobic
(alanine, valine, leucine, isoleucine, proline, tryptophan, phenylalanine, and
methionine); neutral and polar (glycine, serine, threonine, tyrosine, cysteine,
asparagine, and glutamine); basic (lysine, arginine, and histidine); and acidic (aspartic
acid and glutamic acid).

Generally, in substituting different amino acids it is preferable to
exchange amino acids having similar properties. Substitutions of different amino acids
within a particular group, such as substituting valine for leucine, arginine for lysine,
and asparagine for glutamine, are good candidates for not causing a change in
polypeptide tertiary structure.
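The grouping above lends itself to a simple lookup. The following sketch, offered only as an illustration, encodes the four groups with one-letter amino acid codes and reports whether a proposed substitution stays within a group. The names `AMINO_ACID_GROUPS`, `group_of` and `is_conservative` are hypothetical, and membership in the same group is only a rough indicator of structural compatibility.

```python
# One-letter codes for the four groups listed in the text.
AMINO_ACID_GROUPS = {
    "neutral and hydrophobic": set("AVLIPWFM"),
    "neutral and polar": set("GSTYCNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def group_of(residue: str) -> str:
    """Return the name of the group containing the given residue."""
    for name, members in AMINO_ACID_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within the same group."""
    return group_of(original) == group_of(replacement)
```

Under this sketch, valine for leucine, arginine for lysine, and asparagine for glutamine are each reported as within-group exchanges, whereas aspartic acid for lysine is not.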

Starting with a particular amino acid sequence and the known
degeneracy of the genetic code, a large number of different encoding nucleic acid
sequences can be obtained. The degeneracy of the genetic code arises because almost
all amino acids are encoded by different combinations of nucleotide triplets or
"codons". The translation of a particular codon into a particular amino acid is well
known in the art (see, e.g., Lewin GENES IV, p. 119, Oxford University Press, 1990).
Amino acids are encoded by codons as follows:

A=Ala=Alanine: codons GCA, GCC, GCG, GCU
C=Cys=Cysteine: codons UGC, UGU
D=Asp=Aspartic acid: codons GAC, GAU
E=Glu=Glutamic acid: codons GAA, GAG
F=Phe=Phenylalanine: codons UUC, UUU
G=Gly=Glycine: codons GGA, GGC, GGG, GGU
H=His=Histidine: codons CAC, CAU
I=Ile=Isoleucine: codons AUA, AUC, AUU
K=Lys=Lysine: codons AAA, AAG
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU
M=Met=Methionine: codon AUG
N=Asn=Asparagine: codons AAC, AAU
P=Pro=Proline: codons CCA, CCC, CCG, CCU
Q=Gln=Glutamine: codons CAA, CAG
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU
T=Thr=Threonine: codons ACA, ACC, ACG, ACU
V=Val=Valine: codons GUA, GUC, GUG, GUU
W=Trp=Tryptophan: codon UGG
Y=Tyr=Tyrosine: codons UAC, UAU
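To give a sense of how quickly the number of possible encoding sequences grows, the following sketch multiplies the number of codons available for each residue of a peptide. The per-residue counts are those of the standard genetic code. The function name and the example peptides in the usage note are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2,
    "I": 3, "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2,
    "R": 6, "S": 6, "T": 4, "V": 4, "W": 1, "Y": 2,
}


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct coding sequences (stop codon excluded)
    that translate to the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide.upper())
```

For instance, a tripeptide such as Leu-Arg-Ser can be encoded in 6 x 6 x 6 = 216 ways, while Met-Trp has only a single encoding.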

Nucleic acid sequences can be optimized in an effort to enhance
expression in a host. Factors to be considered include C:G content, preferred codons,
and the avoidance of inhibitory secondary structure. These factors can be combined
in different ways in an attempt to obtain nucleic acid sequences having enhanced
expression in a particular host. (See, for example, Donnelly et al., International
Publication Number WO 97/47358.)
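Two of the factors mentioned above, C:G content and codon usage, are straightforward to measure for a candidate sequence. The sketch below is illustrative only. The function names are hypothetical, the example sequences in the usage note are arbitrary, and no particular host or optimization procedure is implied.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_usage(coding_seq: str) -> Counter:
    """Count each codon in an in-frame coding sequence.
    DNA input is reported in RNA notation (T is read as U)."""
    seq = coding_seq.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))
```

As a usage example, `codon_usage("ATGGCCGCC")` reports one AUG codon and two GCC codons, and `gc_content("AUGGCC")` returns two thirds.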

The ability of a particular sequence to have enhanced expression in a
particular host involves some empirical experimentation. Such experimentation
involves measuring expression of a prospective nucleic acid sequence and, if needed,
altering the sequence.

B. Encoding Nucleotide Sequences

SEQ. ID. NOs. 2 and 3 provide two examples of nucleotide sequences
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B sequence. The coding sequence of
SEQ. ID. NO. 2 is similar (99.4% nucleotide sequence identity) to the NS3-NS4A-
NS4B-NS5A-NS5B region of the naturally occurring HCV-BK sequence (GenBank
accession number M58335). SEQ. ID. NO. 3 is a codon-optimized version of SEQ.
ID. NO. 2. SEQ. ID. NOs. 2 and 3 have a nucleotide sequence identity of 78.3%.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B
nucleotide sequence (GenBank accession number M58335) and SEQ. ID. NO. 2 include
SEQ. ID. NO. 2 having a ribosome binding site, an ATG methionine codon, a region
coding for a modified NS5B catalytic domain, a TAAA stop signal and an additional 30
nucleotide differences. The modified catalytic domain codes for AlaAlaGly
(residues 1711-1713) instead of GlyAspAsp to inactivate NS5B.

A nucleotide sequence encoding a HCV Met-NS3-NS4A-NS4B-
NS5A-NS5B polypeptide is preferably substantially similar to the SEQ. ID. NO. 2
coding region. In different embodiments, the nucleotide sequence encoding a HCV
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide has a nucleotide sequence identity
to the SEQ. ID. NO. 2 coding region of at least 65%, at least 75%, at least 85%, at
least 95%, at least 99%, or 100%; or differs from SEQ. ID. NO. 2 by 1-2, 1-3, 1-4,
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20,
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides.

Nucleotide differences between a sequence coding Met-NS3-NS4A-
NS4B-NS5A-NS5B and the SEQ. ID. NO. 2 coding region are calculated by
determining the minimum number of nucleotide modifications in which the two
sequences differ. Nucleotide modifications can be deletions, additions, substitutions
or any combination thereof.

Nucleotide sequence identity is determined by methods well known in
the art that compare the nucleotide sequence of one sequence to the nucleotide
sequence of a second sequence and generate a sequence alignment. Sequence identity
is determined from the alignment by counting the number of aligned positions having
identical nucleotides.

Methods for determining nucleotide sequence identity between two
polynucleotides include those described by Schuler, in Bioinformatics: A Practical
Guide to the Analysis of Genes and Proteins, Baxevanis, A.D. and Ouelette, B.F.F.,
eds., John Wiley & Sons, Inc, 2001; Yona et al., in Bioinformatics: Sequence,
structure and databanks, Higgins, D. and Taylor, W. eds, Oxford University Press,
2000; and Bioinformatics: Sequence and Genome Analysis, Mount, D.W., ed., Cold
Spring Harbor Laboratory Press, 2001. Methods to determine nucleotide sequence
identity are codified in publicly available computer programs such as GAP
(Wisconsin Package Version 10.2, Genetics Computer Group (GCG), Madison,
Wisc.), BLAST (Altschul et al., J. Mol. Biol. 215(3):403-10, 1990), and FASTA
(Pearson, W.R., Methods in Enzymology 183:63-98, 1990, R.F. Doolittle, ed.).

In an embodiment of the present invention, sequence identity between
two polynucleotides is determined by application of GAP (Wisconsin Package
Version 10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the
alignment method of Needleman and Wunsch. (Needleman et al., J. Mol. Biol.
48:443-453, 1970.) GAP considers all possible alignments and gap positions between
two sequences and creates a global alignment that maximizes the number of matched
residues and minimizes the number and size of gaps. A scoring matrix is used to
assign values for symbol matches. In addition, a gap creation penalty and a gap
extension penalty are required to limit the insertion of gaps into the alignment.
Default program parameters for polynucleotide comparisons using GAP are the
nwsgapdna.cmp scoring matrix (MATrix=nwsgapdna.cmp), a gap creation parameter
(GAPweight=50) and a gap extension parameter (LENgthweight=3).

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B nucleotide
sequences, in addition to being substantially similar across their entire length, produce
individual NS3, NS4A, NS4B, NS5A and NS5B regions that are substantially similar
to the corresponding regions present in SEQ. ID. NO. 2. The corresponding coding
regions in SEQ. ID. NO. 2 are provided as follows: Met-NS3 nucleotides 7-1902;
NS4A nucleotides 1903-2064; NS4B nucleotides 2065-2847; NS5A nucleotides
2848-4188; NS5B nucleotides 4189-5961.
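The relationship between amino acid positions in SEQ. ID. NO. 1 and nucleotide positions in SEQ. ID. NO. 2 can be sketched as simple arithmetic. The sketch assumes that the coding region begins with the ATG codon at nucleotide 7 and runs without interruption, so that each amino acid occupies three consecutive nucleotides. The function and constant names are hypothetical.

```python
CODING_START = 7  # assumed position of the first nucleotide of the ATG codon

# Amino acid spans of the regions in SEQ. ID. NO. 1, as listed in the text.
AMINO_ACID_REGIONS = {
    "Met-NS3": (1, 632),
    "NS4A": (633, 686),
    "NS4B": (687, 947),
    "NS5A": (948, 1394),
    "NS5B": (1395, 1985),
}


def nucleotide_span(aa_start: int, aa_end: int,
                    coding_start: int = CODING_START) -> tuple:
    """First and last nucleotide (1-based, inclusive) encoding the
    amino acids aa_start through aa_end."""
    first = coding_start + 3 * (aa_start - 1)
    last = coding_start + 3 * aa_end - 1
    return first, last
```

Under these assumptions, `nucleotide_span(1, 632)` gives (7, 1902) for Met-NS3 and `nucleotide_span(633, 686)` gives (1903, 2064) for NS4A.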

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B
encoding region has a nucleotide sequence identity to the corresponding region in
SEQ. ID. NO. 2 of at least 65%, at least 75%, at least 85%, at least 95%, at least 99%
or 100%; or a nucleotide difference to SEQ. ID. NO. 2 of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7,
1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20, 1-25, 1-30,
1-35, 1-40, 1-45, or 1-50 nucleotides.

C. Gene Expression Cassettes

A gene expression cassette contains elements needed for polypeptide
expression. Reference to "polypeptide" does not provide a size limitation and
includes protein. Regulatory elements present in a gene expression cassette generally
include: (a) a promoter transcriptionally coupled to a nucleotide sequence encoding
the polypeptide, (b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence, (c) a terminator joined to the 3' end of the nucleotide sequence, and (d) a 3'
polyadenylation signal functionally coupled to the nucleotide sequence. Additional
regulatory elements useful for enhancing or regulating gene expression or polypeptide
processing may also be present.

Promoters are genetic elements that are recognized by an RNA
polymerase and mediate transcription of downstream regions. Preferred promoters
are strong promoters that provide for increased levels of transcription. Examples of
strong promoters are the immediate early human cytomegalovirus promoter (CMV),
and CMV with intron A. (Chapman et al., Nucl. Acids Res. 19:3979-3986, 1991.)
Additional examples of promoters include naturally occurring promoters such as the
EF1 alpha promoter, the murine CMV promoter, Rous sarcoma virus promoter, and
SV40 early/late promoters and the β-actin promoter; and artificial promoters such as a
synthetic muscle specific promoter and a chimeric muscle-specific/CMV promoter (Li
et al., Nat. Biotechnol. 17:241-245, 1999, Hagstrom et al., Blood 95:2536-2542,
2000).

The ribosome binding site is located at or near the initiation codon.
Examples of preferred ribosome binding sites include CCACCAUGG,
CCGCCAUGG, and ACCAUGG, where AUG is the initiation codon. (Kozak, Cell
44:283-292, 1986). Another example of a ribosome binding site is GCCACCAUGG
(SEQ. ID. NO. 12).
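As an illustration of how the listed ribosome binding sites relate to the initiation codon, the following sketch scans an RNA (or DNA) sequence for each listed site and reports where the AUG within it lies. Positions are 0-based. The function name and the example transcript in the usage note are hypothetical.

```python
import re

# Ribosome binding sites listed in the text; each contains the AUG initiation codon.
RIBOSOME_BINDING_SITES = ("GCCACCAUGG", "CCACCAUGG", "CCGCCAUGG", "ACCAUGG")


def find_ribosome_binding_sites(seq: str) -> list:
    """Return (site, site_start, aug_start) for every occurrence of a
    listed site, sorted by site_start. Overlapping sites are all reported."""
    rna = seq.upper().replace("T", "U")
    hits = []
    for site in RIBOSOME_BINDING_SITES:
        # A lookahead pattern lets overlapping occurrences be found.
        for match in re.finditer(f"(?={site})", rna):
            aug_start = match.start() + site.index("AUG")
            hits.append((site, match.start(), aug_start))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))
```

For a hypothetical transcript such as `"UUGCCACCAUGGCGCCC"`, the longest site GCCACCAUGG is found at position 2, the shorter sites nested within it are also reported, and all of them point to the same AUG at position 8.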

The polyadenylation signal is responsible for cleaving the transcribed
RNA and the addition of a poly (A) tail to the RNA. The polyadenylation signal in
higher eukaryotes contains an AAUAAA sequence about 11-30 nucleotides from the
polyadenylation addition site. The AAUAAA sequence is involved in signaling RNA
cleavage. (Lewin, Genes IV, Oxford University Press, NY, 1990.) The poly (A) tail
is important for the mRNA processing.

Polyadenylation signals that can be used as part of a gene expression
cassette include the minimal rabbit β-globin polyadenylation signal and the bovine
growth hormone polyadenylation (BGH). (Xu et al., Gene 272:149-156, 2001, Post et
al., U.S. Patent No. 5,122,458.) Additional examples include the Synthetic
Polyadenylation Signal (SPA) and SV40 polyadenylation signal. The SPA sequence
is as follows: AAUAAAAGAUCUUUAUUUUCAUUAGAUCUGUGUG
UUGGUUUUUUGUGUG (SEQ. ID. NO. 13).

Examples of additional regulatory elements useful for enhancing or
regulating gene expression or polypeptide processing that may be present include an
enhancer, a leader sequence and an operator. An enhancer region increases
transcription. Examples of enhancer regions include the CMV enhancer and the
SV40 enhancer. (Hitt et al., Methods in Molecular Genetics 7:13-30, 1995, Xu, et al.,
Gene 272:149-156, 2001.) An enhancer region can be associated with a promoter.

A leader sequence is an amino acid region on a polypeptide that directs
the polypeptide into the proteasome. Nucleic acid encoding the leader sequence is 5'
of a structural gene and is transcribed along with the structural gene. An example of a
leader sequence is tPA.

An operator sequence can be used to regulate gene expression. For
example, the Tet operator sequence can be used to repress gene expression.

II. THERAPEUTIC VECTORS

Nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide can be introduced into a patient using vectors suitable for therapeutic
administration. Suitable vectors can deliver nucleic acid into a target cell without
causing an unacceptable side effect.

Cellular expression is achieved using a gene expression cassette
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide. The gene expression
cassette contains regulatory elements for producing and processing a sufficient
amount of nucleic acid inside a target cell to achieve a beneficial effect.

Examples of vectors that can be used for therapeutic applications
include first and second generation adenovectors, helper dependent adenovectors,
adeno-associated viral vectors, retroviral vectors, alpha virus vectors, Venezuelan
Equine Encephalitis virus vector, and plasmid vectors. (Hitt, et al., Advances in
Pharmacology 40:137-206, 1997, Johnston et al., U.S. Patent No. 6,156,588; and
Johnston et al., International Publication Number WO 95/32733.) Preferred vectors
for introducing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide into a subject are
first generation adenoviral vectors and plasmid DNA vectors.

A. First Generation Adenovectors

First generation adenovectors for expressing a gene expression cassette
contain the expression cassette in an E1 and optionally E3 deleted recombinant
adenovirus genome. The deletion in the E1 region is sufficiently large to remove
elements needed for adenoviral replication.

First generation adenovectors for expressing a Met-NS3-NS4A-NS4B-
NS5A-NS5B polypeptide contain an E1 and E3 deleted recombinant adenovirus
genome. The deletion in the E1 region is sufficiently large to remove elements
needed for adenoviral replication. The combinations of deletions of the E1 and E3
regions are sufficiently large to accommodate a gene expression cassette encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide.

The adenovirus has a double-stranded linear genome with inverted
terminal repeats at both ends. During viral replication, the genome is packaged inside
a viral capsid to form a virion. The virus enters its target cell through viral attachment
followed by internalization. (Hitt et al., Advances in Pharmacology 40:137-206,
1997.)

Adenovectors can be based on different adenovirus serotypes such as
those found in humans or animals. Examples of animal adenoviruses include bovine,
porcine, chimp, murine, canine, and avian (CELO). Preferred adenovectors are based
on human serotypes, more preferably Group B, C, or D serotypes. Examples of
human adenovirus Group B, C, D, or E serotypes include types 2 ("Ad2"), 4 ("Ad4"),
5 ("Ad5"), 6 ("Ad6"), 24 ("Ad24"), 26 ("Ad26"), 34 ("Ad34") and 35 ("Ad35").
Adenovectors can contain regions from a single adenovirus or from two or more
adenoviruses.

In different embodiments adenovectors are based on Ad5, Ad6, or a
combination thereof. Ad5 is described by Chroboczek, et al., J. Virology 186:280-285,
1992. Ad6 is described in Figures 7A-7N. An Ad6 based vector containing
Ad5 regions is described in the Example section provided below.

Adenovectors do not need to have their E1 and E3 regions completely
removed. Rather, a sufficient amount of the E1 region is removed to render the vector
replication incompetent in the absence of the E1 proteins being supplied in trans; and
the E1 deletion or the combination of the E1 and E3 deletions is sufficiently large
to accommodate a gene expression cassette.

E1 deletions can be obtained starting at about base pair 342 going up to
about base pair 3523 of Ad5, or a corresponding region from other adenoviruses.
Preferably, the deleted region involves removing a region from about base pair 450 to
about base pair 3511 of Ad5, or a corresponding region from other adenoviruses.
Larger E1 region deletions starting at about base pair 341 remove elements that
facilitate virus packaging.

E3 deletions can be obtained starting at about base pair 27865 to about
base pair 30995 of Ad5, or the corresponding region of other adenovectors.
Preferably the deletion region involves removing a region from about base pair 28134
up to about base pair 30817 of Ad5, or the corresponding region of other
adenovectors.

The combination of deletions to the E1 region and optionally the E3
region should be sufficiently large so that the overall size of the recombinant genome
containing the gene expression cassette does not exceed about 105% of the wild type
adenovirus genome. For example, as recombinant adenovirus Ad5 genomes increase in
size above about 105% the genome becomes unstable. (Bett et al., Journal of
Virology 67:5911-5921, 1993.)

Preferably, the size of the recombinant adenovirus genome containing
the gene expression cassette is about 85% to about 105% the size of the wild type
adenovirus genome. In different embodiments, the size of the recombinant
adenovirus genome containing the expression cassette is about 100% to about
105.2%, or about 100%, the size of the wild type genome.

Approximately 7,500 base pairs can be inserted into an adenovirus genome
with an E1 and E3 deletion. Without any deletion, the Ad5 genome is 35,935 base
pairs and the Ad6 genome is 35,759 base pairs.
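The approximate insert capacity follows from the genome size, the size of the deletions and the 105% limit. The sketch below illustrates the arithmetic for Ad5. It assumes, for illustration, that the preferred E1 deletion removes base pairs 451-3510 and the preferred E3 deletion removes base pairs 28134-30817 (inclusive coordinates), and that the limit is exactly 105% of the wild type genome. The function and constant names are hypothetical.

```python
def max_insert_size(wild_type_bp: int, deleted_spans, limit_percent: int = 105) -> int:
    """Largest insert (in base pairs) that keeps the recombinant genome
    at or below limit_percent of the wild type genome size."""
    deleted = sum(end - start + 1 for start, end in deleted_spans)
    backbone = wild_type_bp - deleted
    size_limit = wild_type_bp * limit_percent // 100
    return size_limit - backbone


AD5_GENOME_BP = 35935
# Assumed inclusive coordinates of the deleted E1 and E3 regions in Ad5.
AD5_E1_E3_DELETIONS = [(451, 3510), (28134, 30817)]
```

Under these assumptions, `max_insert_size(AD5_GENOME_BP, AD5_E1_E3_DELETIONS)` evaluates to 7,540 base pairs, in line with the approximate figure given above.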

Replication of first generation adenovectors can be performed by
supplying the E1 gene products in trans. The E1 gene product can be supplied in
trans, for example, by using cell lines that have been transformed with the adenovirus
E1 region. Examples of cells and cell lines transformed with the adenovirus E1
region are HEK 293 cells, 911 cells, PER.C6™ cells, and transfected primary human
amniocytes cells. (Graham et al., Journal of Virology 36:59-72, 1977, Schiedner et
al., Human Gene Therapy 11:2105-2116, 2000, Fallaux et al., Human Gene Therapy
9:1909-1917, 1998, Bout et al., U.S. Patent No. 6,033,908.)

A Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette should be
inserted into a recombinant adenovirus genome in the region corresponding to the
deleted E1 region or the deleted E3 region. The expression cassette can have a
parallel or anti-parallel orientation. In a parallel orientation the transcription direction
of the inserted gene is the same direction as the deleted E1 or E3 gene. In an
anti-parallel orientation the opposite strand serves as a template and the
transcription direction is in the opposite direction.

In an embodiment of the present invention the adenovector has a gene
expression cassette inserted in the E1 deleted region. The vector contains:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In another embodiment of the present invention the adenovector has an
expression cassette inserted in the E3 deleted region. The vector contains:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) a gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In preferred different embodiments concerning adenovirus regions that
are present: (1) the first, second, third, fourth, and fifth region corresponds to Ad5; (2)
the first, second, third, fourth, and fifth region corresponds to Ad6; and (3) the first
region corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.
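The five adenovirus regions and the three preferred serotype combinations can be represented compactly. The following sketch tabulates the approximate region coordinates given above for Ad5 and Ad6 and sums the length of the adenovirus backbone (excluding the expression cassette) for a chosen combination. The coordinates are treated as exact and inclusive for the purpose of illustration, and all names are hypothetical.

```python
# Inclusive base pair coordinates of each region, per serotype, from the text.
REGIONS = {
    "first":  {"Ad5": (1, 450),       "Ad6": (1, 450)},
    "second": {"Ad5": (3511, 5548),   "Ad6": (3508, 5541)},
    "third":  {"Ad5": (5549, 28133),  "Ad6": (5542, 28156)},
    "fourth": {"Ad5": (30818, 33966), "Ad6": (30789, 33784)},
    "fifth":  {"Ad5": (33967, 35935), "Ad6": (33785, 35759)},
}


def backbone_length(choice: dict) -> int:
    """Total length in base pairs of the adenovirus regions selected by
    `choice`, a mapping from region name to serotype ("Ad5" or "Ad6")."""
    total = 0
    for region, serotype in choice.items():
        start, end = REGIONS[region][serotype]
        total += end - start + 1
    return total


ALL_AD5 = {name: "Ad5" for name in REGIONS}
ALL_AD6 = {name: "Ad6" for name in REGIONS}
# Embodiment (3): Ad5 first and second regions, Ad6 third and fourth, Ad5 fifth.
CHIMERIC = {"first": "Ad5", "second": "Ad5", "third": "Ad6",
            "fourth": "Ad6", "fifth": "Ad5"}
```

With these coordinates the three combinations give backbones of 30,191 (all Ad5), 30,070 (all Ad6) and 30,068 (chimeric) base pairs, so the choice of serotype per region has little effect on the space left for the expression cassette.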

B. DNA Plasmid Vectors

DNA vaccine plasmid vectors contain a gene expression cassette along
with elements facilitating replication and preferably vector selection. Preferred
elements provide for replication in non-mammalian cells and a selectable marker.
The vectors should not contain elements providing for replication in human cells or
for integration into human nucleic acid.

The selectable marker facilitates selection of nucleic acids containing
the marker. Preferred selectable markers are those that confer antibiotic resistance.
Examples of antibiotic selection genes include nucleic acid encoding resistance to
ampicillin, neomycin, and kanamycin.

Suitable DNA vaccine vectors can be produced starting with a plasmid
containing a bacterial origin of replication and a selectable marker. Examples of
bacterial origins of replication providing for higher yields include the ColE1
plasmid-derived bacterial origin of replication. (Donnelly et al., Annu. Rev. Immunol.
15:617-648, 1997.)

The presence of the bacterial origin of replication and selectable
marker allows for the production of the DNA vector in a bacterial strain such as E.
coli. The selectable marker is used to eliminate bacteria not containing the DNA
vector.

III. AD6 RECOMBINANT NUCLEIC ACID

Ad6 recombinant nucleic acid comprises an Ad6 region substantially
similar to an Ad6 region found in SEQ. ID. NO. 8, and a region not present in Ad6
nucleic acid. Recombinant nucleic acid comprising Ad6 regions has different uses
such as in producing different Ad6 regions, as intermediates in the production of Ad6
based vectors, and as a vector for delivering a recombinant gene.

As depicted in Figure 9, the genomic organization of Ad6 is very
similar to the genomic organization of Ad5. The homology between Ad5 and Ad6 is
approximately 98%.

In different embodiments, the Ad6 recombinant nucleic acid comprises
a nucleotide region substantially similar to E1A, E1B, E2B, E2A, E3, E4, L1, L2, L3,
or L4, or any combination thereof. A substantially similar nucleic acid region to an
Ad6 region has a nucleotide sequence identity of at least 65%, at least 75%, at least
85%, at least 95%, at least 99% or 100%; or a nucleotide difference of 1-2, 1-3, 1-4,
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20,
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides. Techniques and embodiments for
determining substantially similar nucleic acid sequences are described in Section I.B.
supra.

Preferably, the recombinant Ad6 nucleic acid contains an expression
cassette coding for a polypeptide not found in Ad6. Examples of expression cassettes
include those coding for HCV regions and those coding for other types of
polypeptides.

Different types of adenoviral vectors can be produced incorporating
different amounts of Ad6, such as first and second generation adenovectors. As noted
in Section II.A. supra, first generation adenovectors are defective in E1 and can
replicate when E1 is supplied in trans.

Second generation adenovectors contain less adenoviral genome than
first generation vectors and can be used in conjunction with complementing cell lines
and/or helper vectors supplying adenoviral proteins. Second generation adenovectors
are described in different references such as Russell, Journal of General Virology
81:2573-2604, 2000; Hitt et al., 1997, Human Ad vectors for Gene Transfer,
Advances in Pharmacology, Vol 40 Academic Press.

In an embodiment of the present invention, the Ad6 recombinant
nucleic acid is an adenovirus vector defective in E1 that is able to replicate when E1 is
supplied in trans. Expression cassettes can be inserted into a deleted E1 region and/or
a deleted E3 region.

An example of an Ad6 based adenoviral vector with an expression
cassette provided in a deleted E1 region comprises or consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) an optionally present fourth region from about base pair 28134
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to
about base pair 30788 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein the fifth region is joined to the fourth
region if the fourth region is present, or the fifth is joined to the third region if the
fourth region is not present; and

g) a sixth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fifth region;

wherein at least one Ad6 region is present.

In different embodiments of the invention, all of the regions are from
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4
regions selected from the second, third, fourth, and fifth regions are from Ad6.

An example of an Ad6 based adenoviral vector with an expression
cassette provided in a deleted E3 region comprises or consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) a gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region;

wherein at least one Ad6 region is present.

In different embodiments of the invention, all of the regions are from
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4
regions selected from the second, third, fourth and fifth regions are from Ad6.

?0 IV. VECTOR PRODUCTION 

Vectors can be produced using recombinant nucleic acid techniques 
such as those involving the use of restriction enzymes, nucleic acid ligation, and 
homologous recombination. Recombinant nucleic acid techniques are well known in 
the art.. (Ausubel, Current Protocols in Molecular Biology, John Wiley, 1987-1998, 

25 and Sambrook et al. Molecular Cloning, A Laboratory Manual 2"^ Edition, Cold 
Spring Harbor Laboratory Press, 1989.) 

Intermediate vectors are used to derive a therapeutic vector or to 
transfer an expression cassette or portion thereof from one vector to another vector. 
Examples of intermediate vectors include adenovirus genome plasmids and shuttle 

30 vectors. 

Useful elements in an intermediate vector include an origin of 
replication, a selectable marker, homologous recombination regions, and convenient 
restriction sites. Convenient restriction sites can be used to facilitate cloning or release 
of a nucleic acid sequence. 
Homologous recombination regions provide nucleic acid sequence
regions that are homologous to a target region in another nucleic acid molecule. The
homologous regions flank the nucleic acid sequence that is being inserted into the
target region. In different embodiments homologous regions are preferably about 150
to 600 nucleotides in length, or about 100 to 500 nucleotides in length.

An embodiment of the present invention describes a shuttle vector
containing a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette, a selectable
marker, a bacterial origin of replication, a first adenovirus homology region and a
second adenovirus homologous region that target the expression cassette to insert in
or replace an E1 region. The first and second homology regions flank the expression
cassette. The first homology region contains at least about 100 base pairs
substantially homologous to at least the right end (3' end) of a wild-type adenovirus
region from about base pairs 4-450. The second homology region contains at least about
100 base pairs substantially homologous to at least the left end (5' end) of Ad5 from
about base pairs 3511-5792, or the corresponding region from another adenovirus.

Reference to "substantially homologous" indicates a sufficient degree
of homology to specifically recombine with a target region. In different embodiments
substantially homologous refers to at least 85%, at least 95%, or 100% sequence
identity. Sequence identity can be calculated as described in Section I.B. supra.

One method of producing adenovectors is through the creation of an
adenovirus genome plasmid containing an expression cassette. The pre-Adenovirus
plasmid contains all the adenovirus sequences needed for replication in the desired
complementing cell line. The pre-Adenovirus plasmid is then digested with a
restriction enzyme to release the viral ITRs and transfected into the complementing
cell line for virus rescue. The ITRs must be released from plasmid sequences to
allow replication to occur. Adenovector rescue results in the production of an
adenovector containing the expression cassette.

A. Adenovirus Genome Plasmids

Adenovirus genome plasmids contain an adenovector sequence inside
a longer-length plasmid (which may be a cosmid). The longer-length plasmid may
contain additional elements such as those facilitating growth and selection in
eukaryotic or bacterial cells depending upon the procedures employed to produce and
maintain the plasmid. Techniques for producing adenovirus genome plasmids include
those involving the use of shuttle vectors and homologous recombination, and those
involving the insertion of a gene expression cassette into an adenovirus cosmid. (Hitt
et al., Methods in Molecular Genetics 7:13-30, 1995, Danthinne et al., Gene Therapy
7:1707-1714, 2000.)

Adenovirus genome plasmids preferably have a gene expression
cassette inserted into an E1 or E3 deleted region. In an embodiment of the present
invention, the adenovirus genome plasmid contains a gene expression cassette
inserted in the E1 deleted region, an origin of replication, a selectable marker, and the
recombinant adenovirus region is made up of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region, and 

g) an optionally present E3 region corresponding to all or part of
the E3 region present in Ad5 or Ad6, which may be present for smaller inserts taking
into account the overall size of the desired adenovector.

In another embodiment of the present invention the recombinant
adenovirus genome plasmid has the gene expression cassette inserted in the E3
deleted region. The vector contains an origin of replication, a selectable marker, and
the following:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 
b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) the gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth region corresponds to Ad5; (2) the
first, second, third, fourth, and fifth region corresponds to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

An embodiment of the present invention describes a method of making
an adenovector involving a homologous recombination step to produce an adenovirus
genome plasmid and an adenovirus rescue step. The homologous recombination step
involves the use of a shuttle vector containing a Met-NS3-NS4A-NS4B-NS5A-NS5B
expression cassette flanked by adenovirus homology regions. The adenovirus
homology regions target the expression cassette into either the E1 or E3 deleted
region.

In an embodiment of the present invention concerning the production
of an adenovirus genome plasmid, the gene expression cassette is inserted into a
vector comprising: a first adenovirus region from about base pair 1 to about base pair
450 corresponding to either Ad5 or Ad6; a second adenovirus region from about base
pair 3511 to about base pair 5548 corresponding to Ad5 or from about base pair 3508
to about base pair 5541 corresponding to Ad6, joined to the first region; a third
adenovirus region from about base pair 5549 to about base pair 28133 corresponding
to Ad5 or from about base pair 5542 to about base pair 28156 corresponding to Ad6,
joined to the second region; a fourth adenovirus region from about base pair 30818 to
about base pair 33966 corresponding to Ad5 or from about base pair 30789 to about
base pair 33784 corresponding to Ad6, joined to the third region; and a fifth
adenovirus region from about base pair 33967 to about base pair 35935 corresponding
to Ad5 or from about base pair 33785 to about base pair 35759 corresponding to Ad6,
joined to the fourth region. The adenovirus genome plasmid should contain an origin
of replication and a selectable marker, and may contain all or part of the Ad5 or Ad6
E3 region.

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth region corresponds to Ad5; (2) the
first, second, third, fourth, and fifth region corresponds to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

B. Adenovector Rescue

An adenovector can be rescued from a recombinant adenovirus
genome plasmid using techniques known in the art or described herein. Examples of
techniques for adenovirus rescue well known in the art are provided by Hitt et al.,
Methods in Molecular Genetics 7:13-30, 1995, and Danthinne et al., Gene Therapy
7:1707-1714, 2000.

A preferred method of rescuing an adenovector described herein
involves boosting adenoviral replication. Boosting adenoviral replication can be
performed, for example, by supplying adenoviral functions such as E2 proteins
(polymerase, pre-terminal protein and DNA binding protein) as well as E4 orf6 on a
separate plasmid. Example 10, infra, illustrates the boosting of adenoviral replication
to rescue an adenovector containing a codon optimized Met-NS3-NS4A-NS4B-
NS5A-NS5B expression cassette.

V. PARTIAL OPTIMIZED HCV ENCODING SEQUENCES
Partial optimization of HCV polyprotein encoding nucleic acid
provides for fewer codons optimized for expression in a human than
complete optimization. The overall objective is to provide the benefits of increased
expression due to codon optimization, while facilitating the production of an
adenovector containing HCV polyprotein encoding nucleic acid having optimized
codons.
Complete optimization of an HCV polyprotein encoding sequence
provides the most frequently observed human codon for each amino acid. Complete
optimization can be performed using codon frequency tables well known in the art
and using programs such as the BACKTRANSLATE program (Wisconsin Package
version 10, Genetics Computer Group, GCG, Madison, Wisc.).
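
The idea of complete optimization, one preferred codon per amino acid, can be sketched in a few lines of Python. This is only an illustration and not the BACKTRANSLATE program: the codon table below is an assumed, commonly cited set of preferred human codons and is not the GCG codon frequency table, and the function name and default stop codon are hypothetical.

```python
# Assumed illustrative table: one preferred human codon per amino acid.
# It is NOT the GCG codon frequency table referred to in the text.
PREFERRED_HUMAN_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein: str, stop_codon: str = "TGA") -> str:
    """Back-translate a protein using a single preferred codon per residue.

    The stop codon default is an arbitrary choice for this sketch.
    """
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_HUMAN_CODON:
            raise ValueError(f"unsupported residue: {residue!r}")
        codons.append(PREFERRED_HUMAN_CODON[residue])
    return "".join(codons) + stop_codon
```

For example, `back_translate("MGDD")` yields a 15-nucleotide sequence: four codons followed by the stop codon.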

Partial optimization can be performed on the entire HCV polyprotein
encoding sequence that is present (e.g., NS3-NS5B), or one or more local regions that are
present. In different embodiments the GC content for the entire HCV encoded polyprotein
that is present is no greater than about 65%; and the GC content for one or more
local regions is no greater than about 70%.

Local regions are regions present in HCV encoding nucleic acid, and can
vary in size. For example, local regions can be about 60, about 70, about 80, about 90 or
about 100 nucleotides in length.
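
A GC-content check of this kind can be sketched as follows. The sketch assumes a sliding window that advances one nucleotide at a time and uses the approximate limits given above (65% overall, 70% per local region) as defaults; the function names and the windowing scheme are assumptions made for illustration, not a method stated in this document.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a non-empty sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def local_gc_violations(seq: str, window: int = 100, local_max: float = 0.70) -> list:
    """Return 0-based start offsets of windows whose GC fraction exceeds local_max.

    Assumption: windows slide one nucleotide at a time; a sequence shorter
    than the window is treated as a single window.
    """
    seq = seq.upper()
    last_start = max(len(seq) - window, 0)
    return [
        start
        for start in range(last_start + 1)
        if gc_fraction(seq[start:start + window]) > local_max
    ]


def within_gc_limits(seq: str, overall_max: float = 0.65,
                     window: int = 100, local_max: float = 0.70) -> bool:
    """True when both the overall and every local GC fraction are within limits."""
    return gc_fraction(seq) <= overall_max and not local_gc_violations(
        seq, window, local_max
    )
```

A sequence can satisfy the overall limit and still fail locally, which is why the two checks are kept separate.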

Partial optimization can be achieved by initially constructing an HCV
encoding polyprotein sequence to be partially optimized based on a naturally occurring
sequence. Alternatively, an optimized HCV encoding sequence can be used as a basis of
comparison to produce a partial optimized sequence.

VI. HCV COMBINATION TREATMENT

The HCV Met-NS3-NS4A-NS4B-NS5A-NS5B vaccine can be used by
itself to treat a patient, can be used in conjunction with other HCV therapeutics, and
can be used with agents targeting other types of diseases. Additional therapeutics
include additional therapeutic agents to treat HCV and diseases having a high
prevalence in HCV infected persons. Agents targeting other types of disease include
vaccines directed against HIV and HBV.

Additional therapeutics for treating HCV include vaccines and non-vaccine
agents. (Zein, Expert Opin. Investig. Drugs 10:1457-1469, 2001.) Examples
of additional HCV vaccines include vaccines designed to elicit an immune response
against an HCV core antigen and the HCV E1, E2 or p7 region. Vaccine components
can be naturally occurring HCV polypeptides, HCV mimotope polypeptides or nucleic
acid encoding such polypeptides.

HCV mimotope polypeptides contain HCV epitopes, but have a
different sequence than a naturally occurring HCV antigen. An HCV mimotope can be
fused to a naturally occurring HCV antigen. References describing techniques for
producing mimotopes in general and describing different HCV mimotopes are
provided in Felici et al., U.S. Patent No. 5,994,083 and Nicosia et al., International
Application Number WO 99/60132.

VII. PHARMACEUTICAL ADMINISTRATION
HCV vaccines can be formulated and administered to a patient using
the guidance provided herein along with techniques well known in the art. Guidelines
for pharmaceutical administration in general are provided in, for example, Modern
Vaccinology, Ed. Kurstak, Plenum Med. Co. 1994; Remington's Pharmaceutical
Sciences 18th Edition, Ed. Gennaro, Mack Publishing, 1990; and Modern
Pharmaceutics 2nd Edition, Eds. Banker and Rhodes, Marcel Dekker, Inc., 1990, each
of which is hereby incorporated by reference herein.

HCV vaccines can be administered by different routes such as
intravenous, intraperitoneal, subcutaneous, intramuscular, intradermal, impression
through the skin, or nasal. A preferred route is intramuscular.

Intramuscular administration can be performed using different
techniques such as by injection with or without one or more electric pulses. Electrically
mediated transfer can assist genetic immunization by stimulating both humoral and
cellular immune responses.

Vaccine injection can be performed using different techniques, such as
by employing a needle or a needleless injection system. An example of a needleless
injection system is a jet injection device. (Donnelly et al., International Publication
Number WO 99/52463.)

A. Electrically Mediated Transfer 

Electrically mediated transfer or Gene Electro-Transfer (GET) can be
performed by delivering suitable electric pulses after nucleic acid injection. (See
Mathiesen, International Publication Number WO 98/43702.) Plasmid injection and
electroporation can be performed using stainless needles. Needles can be used in
couples, triplets or more complex patterns. In one configuration the needles are
soldered on a printed circuit board that is a mechanical support and connects the
needles to the electrical field generator by means of suitable cables.

The electrical stimulus is given in the form of electrical pulses. Pulses 
can be of different forms (square, sinusoidal, triangular, exponential decay) and 
different polarity (monopolar of positive or negative polarity, bipolar). Pulses can be 
delivered either at constant voltage or constant current modality. 
Different patterns of electric treatment can be used to introduce nucleic
acid vaccines including HCV and other nucleic acid vaccines into a patient. Possible
patterns of electric treatment include the following:

Treatment 1: 10 trains of 1000 square bipolar pulses delivered every
other second, pulse length 0.2 msec/phase, frequency 1000 Hz, constant voltage
mode, 45 Volts/phase, floating current.

Treatment 2: 2 trains of 100 square bipolar pulses delivered every other
second, pulse length 2 msec/phase, frequency 100 Hz, constant current mode, 100
mA/phase, floating voltage.

Treatment 3: 2 trains of bipolar pulses at a pulse length of about 2
msec/phase, for a total length of about 3 seconds, where the actual current going
through the tissue is fixed at about 50 mA.
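
The timing implied by Treatments 1 and 2 can be worked out with a short sketch. It assumes that the stated frequency is the pulse repetition rate within a train and that each bipolar pulse consists of two phases of the stated length; these interpretations, and all names in the code, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseTrain:
    """Timing parameters of one train of pulses (illustrative model)."""
    pulses_per_train: int
    frequency_hz: float          # assumed: pulse repetition rate within a train
    phase_length_ms: float       # length of each phase of a pulse
    phases_per_pulse: int = 2    # assumed: a bipolar pulse has two phases

    @property
    def train_duration_s(self) -> float:
        """Time taken to deliver all pulses of one train."""
        return self.pulses_per_train / self.frequency_hz

    @property
    def duty_cycle(self) -> float:
        """Fraction of each pulse period during which current is applied."""
        period_ms = 1000.0 / self.frequency_hz
        return self.phase_length_ms * self.phases_per_pulse / period_ms


# Parameters taken from Treatment 1 and Treatment 2 in the text.
TREATMENT_1 = PulseTrain(pulses_per_train=1000, frequency_hz=1000.0, phase_length_ms=0.2)
TREATMENT_2 = PulseTrain(pulses_per_train=100, frequency_hz=100.0, phase_length_ms=2.0)
```

Under these assumptions each train in both treatments lasts about one second, which is consistent with trains being delivered every other second.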

Electric pulses are delivered through an electric field generator. A
suitable generator can be composed of three independent hardware elements
assembled in a common chassis and driven by a portable PC which runs the driving
program. The software manages both basic and accessory functions. The elements of
the device are: (1) signal generator driven by a microprocessor, (2) power amplifier
and (3) digital oscilloscope.

The signal generator delivers signals having arbitrary frequency and
shape in a given range under software control. The same software has an interactive
editor for the waveform to be delivered. The generator features a digitally controlled
current limiting device (a safety feature to control the maximal current output). The
power amplifier can amplify the signal generated up to +/- 150 V. The oscilloscope is
digital and is able to sample both the voltage and the current being delivered by the
amplifier.

B. Pharmaceutical Carriers

Pharmaceutically acceptable carriers facilitate storage and
administration of a vaccine to a subject. Examples of pharmaceutically acceptable
carriers are described herein. Additional pharmaceutically acceptable carriers are well
known in the art.

Pharmaceutically acceptable carriers may contain different components
such as a buffer, normal saline or phosphate buffered saline, sucrose, salts and
polysorbate. An example of a pharmaceutically acceptable carrier is as follows: 2.5-10
mM TRIS buffer, preferably about 5 mM TRIS buffer; 25-100 mM NaCl, preferably
about 75 mM NaCl; 2.5-10% sucrose, preferably about 5% sucrose; 0.01-2 mM
MgCl2; and 0.001%-0.01% polysorbate 80 (plant derived). The pH is preferably from
about 7.0-9.0, more preferably about 8.0. A specific example of a carrier contains 5
mM TRIS, 75 mM NaCl, 5% sucrose, 1 mM MgCl2, 0.005% polysorbate 80 at pH
8.0.
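
The component ranges above lend themselves to a simple consistency check. The sketch below encodes the stated ranges and confirms that the specific example falls inside each of them; the dictionary keys and function name are hypothetical, and the "about" qualifiers in the text are treated as exact bounds for simplicity.

```python
# Ranges as stated in the text; key names are hypothetical labels.
CARRIER_RANGES = {
    "tris_mM": (2.5, 10.0),
    "nacl_mM": (25.0, 100.0),
    "sucrose_pct": (2.5, 10.0),
    "mgcl2_mM": (0.01, 2.0),
    "polysorbate80_pct": (0.001, 0.01),
    "pH": (7.0, 9.0),
}

# The specific carrier example given in the text.
SPECIFIC_EXAMPLE = {
    "tris_mM": 5.0,
    "nacl_mM": 75.0,
    "sucrose_pct": 5.0,
    "mgcl2_mM": 1.0,
    "polysorbate80_pct": 0.005,
    "pH": 8.0,
}


def out_of_range(formulation: dict) -> list:
    """Return the names of components that are missing or outside their range."""
    problems = []
    for name, (low, high) in CARRIER_RANGES.items():
        value = formulation.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems
```

Calling `out_of_range(SPECIFIC_EXAMPLE)` returns an empty list, since every component of the specific example lies within its stated range.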

C. Dosing Regimes

Suitable dosing regimens can be determined taking into account the
efficacy of a particular vaccine and factors such as age, weight, sex and medical
condition of a patient; the route of administration; the desired effect; and the number
of doses. The efficacy of a particular vaccine depends on different factors such as the
ability of a particular vaccine to produce polypeptide that is expressed and processed
in a cell and presented in the context of MHC class I and II complexes.

HCV encoding nucleic acid administered to a patient can be part of
different types of vectors including viral vectors such as adenovector, and DNA
plasmid vaccines. In different embodiments concerning administration of a DNA
plasmid, about 0.1 to 10 mg of plasmid is administered to a patient, and about 1 to 5
mg of plasmid is administered to a patient. In different embodiments concerning
administration of a viral vector, preferably an adenoviral vector, about 10^5 to 10^11
viral particles are administered to a patient, and about 10^7 to 10^10 viral particles are
administered to a patient.

Viral vector vaccines and DNA plasmid vaccines may be administered
alone, or may be part of a prime and boost administration regimen. A mixed modality
priming and booster inoculation involves either priming with a DNA vaccine and
boosting with viral vector vaccine, or priming with a viral vector vaccine and boosting
with a DNA vaccine.

Multiple priming, for example, about 2-4 or more, may be used. The
length of time between priming and boost may typically vary from about four months
to a year, but other time frames may be used. The use of a priming regimen with a
DNA vaccine may be preferred in situations where a person has a pre-existing
anti-adenovirus immune response.

In an embodiment of the present invention, 1x10^7 to 1x10^12 particles
and preferably about 1x10^10 to 1x10^11 particles of adenovector are administered directly
into muscle tissue. Following initial vaccination a boost is performed with an
adenovector or DNA vaccine.
In another embodiment of the present invention initial vaccination is
performed with a DNA vaccine directly into muscle tissue. Following initial
vaccination a boost is performed with an adenovector or DNA vaccine.

Agents such as interleukin-12, GM-CSF, B7-1, B7-2, IP10, Mig-1 can
be coadministered to boost the immune response. The agents can be coadministered
as proteins or through use of nucleic acid vectors.

D. Heterologous Prime-Boost

Heterologous prime-boost is a mixed modality involving the use of one
type of viral vector for priming and another type of viral vector for boosting. The
heterologous prime-boost can involve related vectors such as vectors based on
different adenovirus serotypes and more distantly related viruses such as adenovirus and
poxvirus. The use of poxvirus and adenovirus vectors to protect mice against malaria
is illustrated by Gilbert et al., Vaccine 20:1039-1045, 2002.

Different embodiments concerning priming and boosting involve the
following types of vectors expressing desired antigens such as Met-NS3-NS4A-
NS4B-NS5A-NS5B: Ad5 vector followed by Ad6 vector; Ad6 vector followed by
Ad5 vector; Ad5 vector followed by poxvirus vector; poxvirus vector followed by
Ad5 vector; Ad6 vector followed by poxvirus vector; and poxvirus vector followed by
Ad6 vector.

The length of time between priming and boosting typically varies from
about four months to a year, but other time frames may be used. The minimum time
frame should be sufficient to allow for an immunological rest. In an embodiment, this
rest is for a period of at least 6 months. Priming may involve multiple priming with
one type of vector, such as 2-4 primings.

Expression cassettes present in a poxvirus vector should contain a
promoter either native to, or derived from, the poxvirus of interest or another poxvirus
member. Different strategies for constructing and employing different types of
poxvirus based vectors including those based on vaccinia virus, modified vaccinia
virus, avipoxvirus, raccoon poxvirus, modified vaccinia virus Ankara,
canarypoxviruses (such as ALVAC), fowlpoxviruses, cowpoxviruses, and NYVAC
are well known in the art. (Moss, Current Topics in Microbiology and Immunology
158:25-38, 1982; Earl et al., In Current Protocols in Molecular Biology, Ausubel et
al., eds., New York: Greene Publishing Associates & Wiley Interscience;
1991:16.16.1-16.16.7; Child et al., Virology 174(2):625-9, 1990; Tartaglia et al.,
Virology 188:217-232, 1992; U.S. Patent Nos. 4,603,112, 4,722,848, 4,769,330,
5,110,587, 5,174,993, 5,185,146, 5,266,313, 5,505,941, 5,863,542, and 5,942,235.)

E. Adjuvants 

HCV vaccines can be formulated with an adjuvant. Adjuvants are
particularly useful for DNA plasmid vaccines. Examples of adjuvants are alum,
AlPO4, alhydrogel, Lipid-A and derivatives or variants thereof, Freund's incomplete
adjuvant, neutral liposomes, liposomes containing the vaccine and cytokines, non-
ionic block copolymers, and chemokines.
Non-ionic block polymers containing polyoxyethylene (POE) and
polyoxypropylene (POP), such as POE-POP-POE block copolymers, may be used as an
adjuvant. (Newman et al., Critical Reviews in Therapeutic Drug Carrier Systems
15:89-142, 1998.) The immune response to a nucleic acid can be enhanced using a
non-ionic block copolymer combined with an anionic surfactant.
A specific example of an adjuvant formulation is one containing
CRL-1005 (CytRx Research Laboratories), DNA, and benzalkonium chloride (BAK).
The formulation can be prepared by adding pure polymer to a cold (< 5°C) solution of
plasmid DNA in PBS using a positive displacement pipette. The solution is then
vortexed to solubilize the polymer. After complete solubilization of the polymer a
clear solution is obtained at temperatures below the cloud point of the polymer
(~6-7°C). Approximately 4 mM BAK is then added to the DNA/CRL-1005 solution in
PBS, by slow addition of a dilute solution of BAK dissolved in PBS. The initial DNA
concentration is approximately 6 mg/mL before the addition of polymer and BAK,
and the final DNA concentration is about 5 mg/mL. After BAK addition the
formulation is vortexed extensively, while the temperature is allowed to increase from
-2°C to above the cloud point. The formulation is then placed on ice to decrease the
temperature below the cloud point. Then, the formulation is vortexed while the
temperature is allowed to increase from -2°C to above the cloud point. Cooling and
mixing while the temperature is allowed to increase from -2°C to above the cloud
point is repeated several times, until the particle size of the formulation is about 200-
500 nm, as measured by dynamic light scattering. The formulation is then stored on
ice until the solution is clear, then placed in storage at -70°C. Before use, the
formulation is allowed to thaw at room temperature.

F. Vaccine Storage

Adenovector and DNA vaccines can be stored using different types of
buffers. For example, buffer A105 described in Example 9, infra, can be used for
vector storage.

Storage of DNA can be enhanced by removal or chelation of trace
metal ions. Reagents such as succinic or malic acid, and chelators can be used to
enhance DNA vaccine stability. Examples of chelators include multiple phosphate
ligands and EDTA. The inclusion of non-reducing free radical scavengers, such as
ethanol or glycerol, can also be useful to prevent damage of DNA plasmid from free
radical production. Furthermore, the buffer type, pH, salt concentration, light
exposure, as well as the type of sterilization process used to prepare the vials, may be
controlled in the formulation to optimize the stability of the DNA vaccine.

VIII. EXAMPLES

Examples are provided below to further illustrate different features of
the present invention. The examples also illustrate useful methodology for practicing
the invention. These examples do not limit the claimed invention.

Example 1: Met-NS3-NS4A-NS4B-NS5A-NS5B Expression Cassettes

Different gene expression cassettes encoding HCV NS3-NS4A-NS4B-
NS5A-NS5B were constructed based on a 1b subtype HCV BK strain. The encoded
sequences had either (1) an active NS5B sequence ("NS"), (2) an inactive NS5B
sequence ("NSmut"), or (3) a codon optimized sequence with an inactive NS5B
sequence ("NSOPTmut"). The expression cassettes also contained a CMV
promoter/enhancer and the BGH polyadenylation signal.

The NS nucleotide sequence (SEQ. ID. NO. 5) differs from HCV BK
strain GenBank accession number M58335 by 30 out of 5952 nucleotides. The NS
amino acid sequence (SEQ. ID. NO. 6) differs from the corresponding 1b genotype
HCV BK strain by 7 out of 1984 amino acids. To allow for initiation of translation an
ATG codon is present at the 5' end of the NS sequence. A TGA termination sequence
is present at the 3' end of the NS sequence.
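
The figures in the paragraph above can be cross-checked with a short calculation. The sketch uses only the counts stated in the text and a simple mismatch-based percent identity; that formula and the function name are assumptions for illustration and are not necessarily the sequence identity calculation referred to elsewhere in this document.

```python
from fractions import Fraction

# Counts stated in the text for the NS sequence.
NS_NUCLEOTIDES = 5952
NS_AMINO_ACIDS = 1984


def percent_identity(differences: int, length: int) -> float:
    """Percent of positions that are identical, given a count of differences."""
    if length <= 0 or not 0 <= differences <= length:
        raise ValueError("differences must lie between 0 and length")
    return float(100 * Fraction(length - differences, length))


# 30 nucleotide differences and 7 amino acid differences, as stated in the text.
nucleotide_identity = percent_identity(30, NS_NUCLEOTIDES)
amino_acid_identity = percent_identity(7, NS_AMINO_ACIDS)
```

The stated lengths are mutually consistent, since 5952 nucleotides correspond to exactly 1984 codons, and both identities computed this way are above 99%.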

The NSmut nucleotide sequence (SEQ. ID. NO. 2, Figure 2) is similar
to the NS sequence. The differences between NSmut and NS include NSmut having
an altered NS5B catalytic site; an optimal ribosome binding site at the 5' end; and a
TAAA termination sequence at the 3' end. The alterations in NS5B comprise bases
5138 to 5146, which encode amino acids 1711 to 1713. The alterations result in a
change of amino acids GlyAspAsp into AlaAlaGly and create an inactive form of the
NS5B RNA-dependent RNA polymerase.

The NSOPTmut sequence (SEQ. ID. NO. 3, Figure 3) was designed
based on the amino acid sequence encoded by NSmut. The NSmut amino acid
sequence was back translated into a nucleotide sequence with the GCG (Wisconsin
Package version 10, Genetics Computer Group, GCG, Madison, Wisc.)
BACKTRANSLATE program. To generate a NSOPTmut nucleotide sequence where
each amino acid is coded for by the corresponding most frequently observed human
codon, the program was run choosing as parameter the generation of the most
probable nucleotide sequence and specifying the codon frequency table of highly
expressed human genes (human_high.cod) available within the GCG Package as
translation scheme.

Example 2: Generation of pV1Jns Plasmids with NS, NSmut or NSOPTmut Sequences
pV1Jns plasmids containing either the NS sequence, NSmut sequence
or NSOPTmut sequence were generated and characterized as follows:

pV1Jns Plasmid with the NS Sequence
The coding region Met-NS3-NS4A-NS4B-NS5A and the coding
region Met-NS3-NS4A-NS4B-NS5A-NS5B from a HCV BK type strain (Tomei et
al., J. Virol. 67:4017-4026, 1993) were cloned into pcDNA3 plasmid (Invitrogen),
generating pcD3-5a and pcD3-5b vectors, respectively. pcD3-5a was digested with
Hind III, blunt-ended with Klenow fill-in and subsequently digested with Xba I, to
generate a fragment corresponding to the coding region of Met-NS3-NS4A-NS4B-
NS5A. The fragment was cloned into pV1Jns-poly, digested with Bgl II, blunt-ended
with Klenow fill-in and subsequently digested with Xba I, generating pV1JnsNS3-5A.

pV1Jns-poly is a derivative of pV1JnsA plasmid (Montgomery et al.,
DNA and Cell Biol. 12:777-783, 1993), modified by insertion of a polylinker
containing recognition sites for XbaI, PmeI, PacI into the unique BglII and NotI
restriction sites. The pV1Jns plasmid with the NS sequence (pV1JnsNS3-5B) was
obtained by homologous recombination into the bacterial strain BJ5183,
co-transforming pV1JNS3-5A linearized with XbaI and NotI digestion and a PCR
fragment containing approximately 200 bp of NS5A, NS5B coding sequence and
approximately 60 bp of the BGH polyadenylation signal. The resulting plasmid
represents pV1Jns-NS.

pV1Jns-NS can be summarized as follows:
Bases 1 to 1881 of pV1JnsA
an additional AGCTT
then the Met-NS3-NS5B sequence (SEQ. ID. NO. 5)
then the wt TGA stop
an additional TCTAGAGCGTTTAAACCCTTAATTAAGG (SEQ. ID. NO. 14)
Bases 1912 to 4909 of pV1JnsA

pV1Jns Plasmid with the NSmut Sequence

The V1JnsNS3-5A plasmid was modified at the 5' end of the NS3 coding
sequence by addition of a full Kozak sequence. The plasmid (V1JNS3-5Akozak) was
obtained by homologous recombination into the bacterial strain BJ5183,
co-transforming V1JNS3-5A linearized by AflII digestion and a PCR fragment containing
the proximal part of Intron A, the restriction site BglII, a full Kozak translation
initiation sequence and part of the NS3 coding sequence.

The resulting plasmid (V1JNS3-5Akozak) was linearized with Xba I
digestion and co-transformed into the bacterial strain BJ5183 with a PCR fragment
containing approximately 200 bp of NS5A, the NS5B mutated sequence, the strong
translation termination TAAA and approximately 60 bp of the BGH polyadenylation
signal. The PCR fragment was obtained by assembling two 22 bp-overlapping
fragments where mutations were introduced by the oligonucleotides used for their
amplification. The resulting plasmid represents pV1Jns-NSmut.

pV1Jns-NSmut can be summarized as follows:
Bases 1 to 1882 of pV1JnsA
then the kozak Met-NS3-NS5B(mut) TAAA sequence (SEQ. ID. NO. 2)
an additional TCTAGA
Bases 1925 to 4909 of pV1JnsA

pV1Jns Plasmid with the NSOPTmut Sequence

The human codon-optimized synthetic gene (NSOPTmut) with
mutated NS5B to abrogate enzymatic activity, full Kozak translation initiation
sequence and a strong translation termination was digested at the BamHI and SalI
restriction sites present at the 5' and 3' ends of the gene. The gene was then cloned
into the BglII and SalI restriction sites present in the polylinker of pV1JnsA plasmid,
generating pV1Jns-NSOPTmut.

pV1Jns-NSOPTmut can be summarized as follows:
Bases 1 to 1881 of pV1JnsA
an additional C
then kozak Met-NS3-NS5B(optmut) TAAA sequence (SEQ. ID. NO. 3)
an additional TTTAAATGTTTAAAC (SEQ. ID. NO. 15)
Bases 1905 to 4909 of pV1JnsA



Plasmids Characterization 

Expression of HCV NS proteins was tested by transfection of HEK 293 cells, grown in 10% FCS/DMEM supplemented with L-glutamine (final 4 mM). Twenty-four hours before transfection, cells were plated in 6-well plates (35 mm diameter) to reach 90-95% confluence on the day of transfection. Forty nanograms of plasmid DNA (previously assessed as a non-saturating DNA amount) were co-transfected with 100 ng of pRSV-Luc plasmid, containing the luciferase reporter gene under the control of the Rous sarcoma virus promoter, using the LIPOFECTAMINE 2000 reagent. Cells were kept in a CO2 incubator for 48 hours at 37°C.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts were normalized for luciferase activity and run in serial dilution on a 10% SDS-acrylamide gel. Proteins were transferred to nitrocellulose and assayed with antibodies directed against NS3, NS5A and NS5B to assess strength of expression and correct proteolytic cleavage. Mock-transfected cells were used as a negative control. Results from representative experiments testing pV1JnsNS, pV1JnsNSmut and pV1JnsNSOPTmut are shown in Figure 12.

Example 3: Mice Immunization with Plasmid DNA Vectors 

The DNA plasmids pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut were injected into different mouse strains to evaluate their potential to elicit anti-HCV immune responses. Two different strains (Balb/C and C57Black6, N=9-10) were injected intramuscularly with 25 or 50 µg of DNA followed by electrical pulses. Each animal received two doses at a three-week interval.

Humoral immune response elicited in C57Black6 mice against the NS3 protein was measured in post dose two sera by ELISA on bacterially expressed NS3 protease domain. Antibodies specific for the tested antigen were detected in animals immunized with all three vectors with geometric mean titers (GMT) ranging from 94000 to 133000 (Tables 1-3).



Table 1: pV1Jns-NS

Mouse no.    Titer
1            105466
2            891980
3            78799
4            39496
5            543542
6            182139
7            32351
8            95028
9            67800
GMT          94553



Table 2: pV1Jns-NSmut

Mouse no.    Titer
11           202981
12           55670
13           130786
14           49748
15           17672
16           174958
17           44304
18           37337
19           78182
20           193695
GMT          75683



Table 3: pV1Jns-NSOPTmut

Mouse no.    Titer
21           310349
22           43645
23           63496
24           82174
25           630778
26           297259
27           66861
28           146735
29           173506
30           77732
GMT          133165
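
The geometric mean titer used to summarize each group in Tables 1-3 can be computed on a log scale. The following is a minimal sketch, assuming the GMT is the exponential of the mean of the log titers; the function name and the sample values are illustrative and are not taken from the tables.

```python
import math


def geometric_mean_titer(titers):
    """Return the geometric mean of a list of positive ELISA titers."""
    if not titers:
        raise ValueError("at least one titer is required")
    # Average in log space, then transform back to the titer scale.
    mean_log = sum(math.log(t) for t in titers) / len(titers)
    return math.exp(mean_log)


# Illustrative values only (not data from Tables 1-3).
example_titers = [10_000, 100_000, 1_000_000]
print(round(geometric_mean_titer(example_titers)))
```

Working in log space keeps a single very high responder from dominating the group summary, which is the usual reason for reporting a GMT rather than an arithmetic mean.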



A T cell response was measured in C57Black6 mice immunized with two intramuscular injections at a three-week interval with 25 µg of plasmid DNA. Quantitative ELIspot assay was performed to determine the number of IFNγ secreting T cells in response to five pools of 20mer peptides overlapping by ten residues encompassing the NS3-NS5B sequence. Specific CD8+ response was analyzed by the same assay using a 20mer peptide encompassing a CD8+ epitope for C57Black6 mice (pep1480).

Cells secreting IFNγ in an antigen-specific manner were detected using a standard ELIspot assay. T cell response in mice immunized with two intramuscular injections at a three-week interval with 50 µg of plasmid DNA was analyzed by the same ELIspot assay measuring the number of IFNγ secreting T cells in response to five pools of 20mer peptides overlapping by ten residues encompassing the NS3-NS5B sequence.

Spleen cells were prepared from immunized mice and re-suspended in R10 medium (RPMI 1640 supplemented with 10% FCS, 2 mM L-Glutamine, 50 U/ml-50 µg/ml Penicillin/Streptomycin, 10 mM Hepes, 50 µM 2-mercapto-ethanol). Multiscreen 96-well Filtration Plates (Millipore, Cat. No. MAIPS4510, Millipore Corporation, 80 Ashby Road, Bedford, MA) were coated with purified rat anti-mouse IFNγ antibody (PharMingen, Cat. No. 18181D, PharMingen, 10975 Torreyana Road, San Diego, California 92121-1111 USA). After overnight incubation, plates were washed with PBS 1X/0.005% Tween and blocked with 250 µl/well of R10 medium.

Splenocytes from immunized mice were prepared and incubated for twenty-four hours in the presence or absence of 10 µM peptide at a density of 2.5 x 10^5/well or 5 x 10^5/well. After extensive washing (PBS 1X/0.005% Tween), biotinylated rat anti-mouse IFNγ antibody (PharMingen, Cat. No. 18112D, PharMingen, 10975 Torreyana Road, San Diego, California 92121-1111 USA) was added and incubated overnight at 4°C. For development, streptavidin-AKP (PharMingen, Cat. No. 13043E, PharMingen, 10975 Torreyana Road, San Diego, California 92121-1111 USA) and 1-Step™ NBT-BCIP development solution (Pierce, Cat. No. 34042, Pierce, P.O. Box 117, Rockford, IL 61105 USA) were added.

Pools of 20mer overlapping peptides encompassing the entire sequence of the HCV BK strain NS3 to NS5B were used to reveal HCV-specific IFNγ-secreting T cells. Similarly, a single 20mer peptide encompassing a CD8+ epitope for C57Black6 mice was used to detect the CD8 response. Representative data from groups of C57Black6 and Balb/C mice (N=9-10) immunized with two injections of 25 or 50 µg of plasmid vectors pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut are shown in Figures 13A and 13B.

Example 4: Immunization of Rhesus Macaques

Rhesus macaques (N=3) were immunized by intramuscular injection with 5 mg of plasmid pV1Jns-NSOPTmut in 7.5 mg/ml CRL1005, benzalkonium chloride 0.6 mM. Each animal received two doses in the deltoid muscle at 0 and 4 weeks.

CMI was measured at different time points by IFN-γ ELISPOT. This assay measures HCV antigen-specific CD8+ and CD4+ T lymphocyte responses, and can be used for a variety of mammals, such as humans, rhesus monkeys, mice, and rats.

The use of a specific peptide or a pool of peptides can simplify antigen presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and interferon-gamma intracellular staining assays. Peptides based on the amino acid sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5A, NS5B) were prepared for use in these assays to measure immune responses in HCV DNA and adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans. The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large pools of peptides can be used to detect an overall response to HCV proteins while smaller pools and individual peptides may be used to define the epitope specificity of a response.
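
The peptide layout described above (20-mers offset by 10 residues) can be sketched as follows. This is an illustration only: the function name is hypothetical, and the handling of the C-terminal tail (adding one final peptide that ends at the last residue) is an assumption, since the text does not say how the end of each protein was covered.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Split a protein sequence into overlapping peptides.

    Peptides start every `offset` residues. If the regular grid does not
    reach the C-terminus, one extra peptide ending at the last residue is
    added (an assumption made for this sketch).
    """
    last_start = max(len(protein) - length, 0)
    starts = list(range(0, last_start + 1, offset))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [protein[s:s + length] for s in starts]


# Hypothetical 45-residue sequence, used only to show the layout.
demo = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQRSTVWYACDEF"
for peptide in overlapping_peptides(demo):
    print(peptide)
```

With a 10-residue offset, every internal residue is covered by two consecutive peptides, so an epitope of up to 11 residues is always contained whole in at least one peptide.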

IFNγ ELISPOT

The IFNγ-ELISPOT assay provides a quantitative determination of HCV-specific T lymphocyte responses. PBMC are serially diluted and placed in microplate wells coated with anti-rhesus IFN-γ antibody (MD-1 U-Cytech). They are cultured with a HCV peptide pool for 20 hours, resulting in the restimulation of the precursor cells and secretion of IFN-γ. The cells are washed away, leaving the secreted IFN bound to the antibody-coated wells in concentrated areas where the cells were sitting. The captured IFN is detected with biotinylated anti-rhesus IFN antibody (detector Ab U-Cytech) followed by alkaline phosphatase-conjugated streptavidin (Pharmingen 13043E). The addition of insoluble alkaline phosphatase substrate results in dark spots in the wells at the sites where the cells were located, leaving one spot for each T cell that secreted IFN-γ.

The number of spots per well is directly related to the precursor frequency of antigen-specific T cells. Gamma interferon was selected as the cytokine visualized in this assay (using species specific anti-gamma interferon monoclonal antibodies) because it is the most common, and one of the most abundant cytokines synthesized and secreted by activated T lymphocytes. For this assay, the number of spot forming cells (SFC) per million PBMCs is determined for samples in the presence and absence (media control) of peptide antigens. Data from Rhesus macaques on PBMC from post dose two material are shown in Table 4.



Table 4

pV1J-NSOPTmut

Pep pools    21G    99C161    99C166
F (NS3p)     8      10        170
G (NS3h)     7      592       229
H (NS4)      3      14        16
I (NS5a)     5      71        36
L (NS5b)     14     23        11
M (NS5b)     3      35        8
DMSO         2      4         5

IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 5 mg DNA/dose in OPTIVAX/BAK of plasmid pV1Jns-NSOPTmut. Data are expressed as SFC/10^6 PBMC.
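
The normalization to SFC per million PBMC can be sketched as below. The function names and the raw spot count are hypothetical. Subtracting the control well is shown only as one common way to read a peptide response against its media/DMSO control; the text itself reports both values side by side and does not state that a subtraction was performed.

```python
def sfc_per_million(spot_count, cells_per_well):
    """Scale a raw spot count to spot forming cells per 10^6 cells."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return spot_count * 1_000_000 / cells_per_well


def net_response(peptide_sfc, control_sfc):
    """Peptide response after subtracting the control (assumed convention)."""
    return peptide_sfc - control_sfc


# Hypothetical raw count: 37 spots in a well seeded with 250,000 PBMC.
print(sfc_per_million(37, 250_000))

# Values from Table 4, animal 99C161: pool G (NS3h) versus the DMSO control.
print(net_response(592, 4))
```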



Example 5: Construction of Ad6 Pre-Adenovirus Plasmids

Ad6 pre-adenovirus plasmids were obtained as follows:

Construction of pAd6 E1-E3+ Pre-adenovirus Plasmid

An Ad6 based pre-adenovirus plasmid which can be used to generate first generation Ad6 vectors was constructed either taking advantage of the extensive sequence identity (approx. 98%) between Ad5 and Ad6 or containing only Ad6 regions. Homologous recombination was used to clone wtAd6 sequences into a bacterial plasmid.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid containing Ad5 and Ad6 regions is illustrated in Figure 10. Cotransformation of BJ5183 bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed the Ad5 ITR cassette resulted in the circularization of the viral genome by homologous recombination. The ITR cassette contains sequences from the right (bp 33798 to 35935) and left (bp 1 to 341 and bp 3525 to 5767) end of the Ad5 genome separated by plasmid sequences containing a bacterial origin of replication and an ampicillin resistance gene. The ITR cassette contains a deletion of E1 sequences from Ad5 342 to 3524. The Ad5 sequences in the ITR cassette provide regions of homology with the purified Ad6 viral DNA in which recombination can occur.

Potential clones were screened by restriction analysis and one clone was selected as pAd6E1-E3+. This clone was then sequenced in its entirety. pAd6E1-E3+ contains Ad5 sequences from bp 1 to 341 and from bp 3525 to 5548, Ad6 bp 5542 to 33784, and Ad5 bp 33967 to 35935 (bp numbers refer to the wt sequence for both Ad5 and Ad6). pAd6E1-E3+ contains the coding sequences for all Ad6 virion structural proteins which constitute its serotype specificity.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid containing Ad6 regions is illustrated in Figure 11. Cotransformation of BJ5183 bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed the Ad6 ITR cassette resulted in the circularization of the viral genome by homologous recombination. The ITR cassette contains sequences from the right (bp 35460 to 35759) and left (bp 1 to 450 and bp 3508 to 3807) end of the Ad6 genome separated by plasmid sequences containing a bacterial origin of replication and an ampicillin resistance gene. These three segments were generated by PCR and cloned sequentially into pNEB193, generating pNEBAd6-3 (the ITR cassette). The ITR cassette contains a deletion of E1 sequences from Ad5 451 to 3507. The Ad6 sequences in the ITR cassette provide regions of homology with the purified Ad6 viral DNA in which recombination can occur.

Construction of pAd6 E1-E3- pre-adenovirus plasmids

Ad6 based vectors containing Ad5 regions and deleted in the E3 region were constructed starting with pAd6E1-E3+ containing Ad5 regions. A 5322 bp subfragment of pAd6E1-E3+ containing the E3 region (Ad6 bp 25871 to 31192) was subcloned into pABS.3 generating pABSAd6E3. Three E3 deletions were then made in this plasmid generating three new plasmids pABSAd6E3(1.8Kb) (deleted for Ad6 bp 28602 to 30440), pABSAd6E3(2.3Kb) (deleted for Ad6 bp 28157 to 30437) and pABSAd6E3(2.6Kb) (deleted for Ad6 bp 28157 to 30788). Bacterial recombination was then used to substitute the three E3 deletions back into pAd6E1-E3+ generating the Ad6 genome plasmids pAd6E1-E3-1.8Kb, pAd6E1-E3-2.3Kb and pAd6E1-E3-2.6Kb.
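
As a quick arithmetic check, the nominal sizes in the plasmid names can be recovered from the stated Ad6 coordinates. The sketch below assumes the base-pair intervals are inclusive at both ends; that assumption is consistent with the 5322 bp subfragment spanning bp 25871 to 31192, but the text does not state it for the deletions. The helper name is hypothetical.

```python
# Deleted Ad6 intervals as given in the text (first bp, last bp).
E3_DELETIONS = {
    "pABSAd6E3(1.8Kb)": (28602, 30440),
    "pABSAd6E3(2.3Kb)": (28157, 30437),
    "pABSAd6E3(2.6Kb)": (28157, 30788),
}


def span_length(first_bp, last_bp):
    """Length of a base-pair interval, counting both end points."""
    return last_bp - first_bp + 1


# The subcloned E3 fragment should come out at the stated 5322 bp.
print("E3 subfragment:", span_length(25871, 31192), "bp")

for name, (first_bp, last_bp) in E3_DELETIONS.items():
    size = span_length(first_bp, last_bp)
    print(f"{name}: {size} bp (~{size / 1000:.1f} kb)")
```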
Example 6: Generation of Ad5 Genome Plasmid with the NS Sequence

A pcDNA3 plasmid (Invitrogen) containing the coding region NS3-NS4A-NS4B-NS5A was digested at the XmnI and NruI restriction sites and the DNA fragment containing the CMV promoter, the NS3-NS4A-NS4B-NS5A coding sequence and the Bovine Growth Hormone (BGH) polyadenylation signal was cloned into the unique EcoRV restriction site of the shuttle vector pDelE1Spa, generating the Sva3-5A vector.

A pcDNA3 plasmid containing the coding region NS3-NS4A-NS4B-NS5A-NS5B was digested with XmnI and EcoRI (partial digestion), and the DNA fragment containing part of NS5A, the NS5B gene and the BGH polyadenylation signal was cloned into the Sva3-5A vector, digested with EcoRI and BglII and blunted with Klenow, generating the Sva3-5B vector.

The Sva3-5B vector was finally digested at the SspI and Bst1107I restriction sites and the DNA fragment containing the expression cassette (CMV promoter, NS3-NS4A-NS4B-NS5A-NS5B coding sequence and the BGH polyadenylation signal) flanked by adenovirus sequences was co-transformed with pAd5HVO (E1-,E3-) ClaI linearized genome plasmid into the bacterial strain BJ5183, to generate pAd5HVONS. pAd5HVO contains Ad5 bp 1 to 341, bp 3525 to 28133 and bp 30818 to 35935.

Example 7: Generation of Adenovirus Genome Plasmids with the NSmut Sequence

Adenovirus genome plasmids containing an NS-mut sequence were generated in an Ad5 or Ad6 background. The Ad6 background contained Ad5 regions at bases 1 to 450, 3511 to 5548 and 33967 to 35935.

pV1JNS3-5Akozak was digested with BglII and XbaI restriction enzymes and the DNA fragment containing the Kozak sequence and the sequence coding NS3-NS4A-NS4B-NS5A was cloned into a BglII and XbaI digested polypMRKpdelE1 shuttle vector. The resulting vector was designated shNS3-5Akozak.

PolypMRKpdelE1 is a derivative of MRKpdelE1(Pac/pIX/pack450) + CMVmin + BGHpA(str.) modified by the insertion of a polylinker containing recognition sites for BglII, PmeI, SwaI, XbaI, SalI, into the unique BglII restriction site present downstream of the CMV promoter. MRKpdelE1(Pac/pIX/pack450) + CMVmin + BGHpA(str.) contains Ad5 sequences from bp 1 to 5792 with a deletion of E1 sequences from bp 451 to 3510. The human CMV promoter and BGH polyadenylation signal were inserted into the E1 deletion in an E1 parallel orientation with a unique BglII site separating them.

The NS5B fragment, mutated to abrogate enzymatic activity and with a strong translation termination at the 3' end, was obtained by assembly PCR and inserted into the shNS3-5Akozak vector via homologous recombination, generating polypMRKpdelE1NSmut. In polypMRKpdelE1NSmut the NS-mut coding sequence is under the control of the CMV promoter and the BGH polyadenylation signal is present downstream.

The gene expression cassette and the flanking regions which contain adenovirus sequences allowing homologous recombination were excised by digestion with PacI and Bst1107I restriction enzymes and co-transformed with either pAd5HVO (E1-,E3-) or pAd6E1-E3-2.6Kb ClaI linearized genome plasmids into the bacterial strain BJ5183, to generate pAd5HVONSmut and pAd6E1-,E3-NSmut, respectively.

pAd6E1-E3-2.6Kb contains Ad5 bp 1 to 341 and from bp 3525 to 5548, Ad6 bp 5542 to 28157 and from bp 30788 to 33784, and Ad5 bp 33967 to 35935 (bp numbers refer to the wt sequence for both Ad5 and Ad6). In both plasmids the viral ITRs are joined by plasmid sequences that contain the bacterial origin of replication and an ampicillin resistance gene.

Example 8: Generation of Adenovirus Genome Plasmids with the NSOPTmut

The human codon-optimized synthetic gene (NSOPTmut) provided by SEQ. ID. NO. 3 cloned into a pCRBlunt vector (Invitrogen) was digested with BamHI and SalI restriction enzymes and cloned into the BglII and SalI restriction sites present in the shuttle vector polypMRKpdelE1. The resulting clone (polypMRKpdelE1NSOPTmut) was digested with PacI and Bst1107I restriction enzymes and co-transformed with either pAd5HVO (E1-,E3-) or pAd6E1-E3-2.6Kb ClaI linearized genome plasmids, into the bacterial strain BJ5183, to generate pAd5HVONSOPTmut and pAd6E1-,E3-NSOPTmut, respectively.

Example 9: Rescue and Amplification of Adenovirus Vectors

Adenovectors were rescued in Per.C6 cells. Per.C6 were grown in 10% FCS/DMEM supplemented with L-glutamine (final 4 mM), penicillin/streptomycin (final 100 IU/ml) and 10 mM MgCl2. After infection, cells were kept in the same medium supplemented with 5% horse serum (HS). For viral rescue, 2.5 x 10^6 Per.C6 were plated in 6 cm Ø Petri dishes.

Twenty-four hours after plating, cells were transfected by the calcium phosphate method with 10 µg of the PacI linearized adenoviral DNA. The DNA precipitate was left on the cells for 4 hours. The medium was removed and 5% HS/DMEM was added.

Cells were kept in a CO2 incubator until a cytopathic effect was visible (1 week). Cells and supernatant were recovered and subjected to 3X freeze/thawing cycles (liquid nitrogen / water bath at 37°C). The lysate was centrifuged at 3000 rpm at 4°C for 20 minutes and the recovered supernatant (corresponding to a cell lysate containing virus passed on cells only once; P1) was used, in the amount of 1 ml/dish, to infect 80-90% confluent Per.C6 in 10 cm Ø Petri dishes. The infected cells were incubated until a cytopathic effect was visible, cells and supernatant were recovered, and the lysate was prepared as described above (P2).

P2 lysate (4 ml) was used to infect 2 x 15 cm Ø Petri dishes. The lysate recovered from this infection (P3) was kept in aliquots at -80°C as a stock of virus to be used as starting point for big viral preparations. In this case, 1 ml of the stock was enough to infect 2 x 15 cm Ø Petri dishes and the resulting lysate (P4) was used for the infection of the Petri dishes devoted to the large scale infection.

Further amplification was obtained from the P4 lysate which was diluted in medium without FCS and used to infect 30 x 15 cm Ø Petri dishes (with Per.C6 80%-90% confluent) in the amount of 10 ml/dish. Cells were incubated 1 hour in the CO2 incubator, mixing gently every 20 minutes. 12 ml/dish of 5% HS/DMEM was added and cells were incubated until a cytopathic effect was visible (about 48 hours).

Cells and supernatant were collected and centrifuged at 2K rpm for 20 minutes at 4°C. The pellet was resuspended in 15 ml of 0.1 M Tris pH=8.0. Cells were lysed by 3X freeze/thawing cycles (liquid nitrogen / water bath at 37°C). 150 µl of 2 M MgCl2 and 75 µl of DNAse (10 mg of bovine pancreatic deoxyribonuclease I in 10 ml of 20 mM Tris-HCl pH=7.4, 50 mM NaCl, 1 mM dithiothreitol, 0.1 mg/ml bovine serum albumin, 50% glycerol) were added. After a 1 hour incubation at 37°C in a water bath (vortex every 15 minutes) the lysate was centrifuged at 4K rpm for 15 minutes at 4°C. The recovered supernatant was ready to be applied on CsCl gradient.

The CsCl gradients were prepared in SW40 ultra-clear tubes as follows:

0.5 ml of 1.5d CsCl
3 ml of 1.35d CsCl
3 ml of 1.25d CsCl
5 ml/tube of viral supernatant was applied

If necessary, the tubes were topped up with 0.1 M Tris-Cl pH=8.0. Tubes were centrifuged at 35K rpm for 1 hour at 10°C with rotor SW40. The viral bands (located at the 1.25/1.35 interface) were collected using a syringe.

The virus was transferred into a new SW40 ultra-clear tube and 1.35d CsCl was added to top the tube up. After centrifugation at 35K rpm for 24 hours at 10°C in the rotor SW40, the virus was collected in the smallest possible volume and dialyzed extensively against buffer A105 (5 mM Tris, 5% sucrose, 75 mM NaCl, 1 mM MgCl2, 0.005% polysorbate 80 pH=8.0). After dialysis, glycerol was added to final 10% and the virus was stored in aliquots at -80°C.

Example 10: Enhanced Adenovector Rescue

First generation Ad5 and Ad6 vectors carrying the HCV NSOPTmut transgene were found to be difficult to rescue. A possible block in the rescue process might be attributed to an inefficient replication of plasmid DNA that is a sub-optimal template for the replication machinery of adenovirus. The absence of the terminal protein linked to the 5' ends of the DNA (normally present in the viral DNA), associated with the very high G-C content of the transgene inserted in the E1 region of the vector, may be causing a substantial reduction in replication rate of the plasmid-derived adenovirus.

To set up a more efficient and reproducible procedure for rescuing Ad vectors, an expression vector (pE2; Figure 19) containing all E2 proteins (polymerase, pre-terminal protein and DNA binding protein) as well as E4 orf6 under the control of a tet-inducible promoter was employed. The transfection of pE2 in combination with a normal preadeno plasmid in PerC6 and in 293 leads to a strong increase of Ad DNA replication and to a more efficient production of complete infectious adenovirus particles.

Plasmid Construction

pE2 is based on the cloning vector pBI (CLONTECH) with the addition of two elements to allow episomal replication and selection in cell culture: (1) the EBV-OriP (EBV [nt] 7421-8042) region permitting plasmid replication in synchrony with the cell cycle when EBNA-1 is expressed and (2) the hygromycin-B phosphotransferase (HPH)-resistance gene allowing a positive selection of transformed cells. The two transcriptional units for the adenoviral genes E2 a and b and E4-Orf6 were constructed and assembled in pE2 as described below.

The Ad5-Polymerase ClaI/SphI fragment and the Ad5-pTP Acc65/EcoRV fragment were obtained from pVac-Pol and pVac-pTP (Stunnemberg et al., NAR 16:2431-2444, 1988). Both fragments were filled with Klenow and cloned into the SalI (filled) and EcoRV sites of pBI, respectively, obtaining pBI-Pol/pTP.

The EBV-OriP element from pCEP4 (Invitrogen) was first inserted within two chicken β-globin insulator dimers by cloning it into the BamHI site of pJC13-1 (Chung et al., Cell 74(3):505-14, 1993). The HS4-OriP fragment from pJC13-OriP was then cloned inside pSA-1mv (a plasmid containing the tk-Hygro-B resistance gene expression cassette as well as the Ad5 replication origin), the ITRs arranged as head-to-tail junction, obtained by PCR from pFG140 (Graham, EMBO J. 3:2917-2922, 1984) using the following primers: 5'-TCGAATCGATACGCGAACCTACGC-3' (SEQ. ID. NO. 16) and 5'-TCGACGTGTCGACTTCGAAGCGCACACCAAAAACGTC-3' (SEQ. ID. NO. 17), thus generating pMVHS4Orip. A DNA fragment from pMVHS4Orip, containing the insulated OriP, Ad5 ITR junction and tk-HygroB cassette, was then inserted into the pBI-Pol/pTP vector restricted with AseI/AatII, generating pBI-Pol/pTPHS4.

To construct the second transcriptional unit expressing Ad5-Orf6 as well as Ad5-DBP, E4orf6 (Ad 5 [nt] 33193-34077) obtained by PCR was first inserted into the pBI vector, generating pBI-Orf6. Subsequently, the DBP coding DNA sequence (Ad 5 [nt] 22443-24032) was inserted into pBI-Orf6 obtaining the second bi-directional Tet-regulated expression vector (pBI-DBP/E4orf6). The original polyA signals present in pBI were substituted with BGH and SV40 polyA.

pBI-DBP/E4orf6 was then modified by inserting a DNA fragment containing the Adeno5-ITRs arranged in head-to-tail junction plus the hygromycin B resistance gene obtained from plasmid pSA-1mv. The new plasmid pBI-DBP/E4orf6shuttle was then used as donor plasmid to insert the second tet-regulated transcriptional unit into pBI-Pol/pTPHS4 by homologous recombination using E. coli strain BJ5183, obtaining pE2.

Cell lines, Transfections and Virus Amplification

PerC6 cells were cultured in Dulbecco's modified Eagle's Medium (DMEM) plus 10% fetal bovine serum (FBS), 10 mM MgCl2, penicillin (100 U/ml), streptomycin (100 µg/ml) and 2 mM glutamine.

All transient transfections were performed using Lipofectamine2000 (Invitrogen) as described by the manufacturer. 90% confluent PerC6 cells plated in 6-cm plates were transfected with 3.5 µg of Ad5/6NSOPTmut pre-adeno plasmids, digested with PacI, alone or in combination with 5 µg pE2 plus 1 µg pUHD52.1. pUHD52.1 is the expression vector for the reverse tet transactivator 2 (rtTA2) (Urlinger et al., Proc. Natl. Acad. Sci. U.S.A. 97(14):7963-7968, 2000). Upon transfection, cells were cultivated in the presence of 1 µg/ml of doxycycline to activate pE2 expression. 7 days post-transfection cells were harvested and cell lysate was obtained by three cycles of freeze-thaw. Two ml of cell lysate were used to infect a second 6-cm dish of PerC6. Infected cells were cultivated until a full CPE was observed, then harvested. The virus was serially passaged five times as described above, then purified on CsCl gradient. The DNA structure of the purified virus was controlled by endonuclease digestion and agarose gel electrophoresis analysis and compared to the original pre-adeno plasmid restriction pattern.

Example 11: Partial Optimization of HCV Polyprotein Encoding Nucleic Acid

Partial optimization of HCV polyprotein encoding nucleic acid was performed to facilitate the production of adenovectors containing codons optimized for expression in a human host. The overall objective was to provide for increased expression due to codon optimization, while facilitating the production of an adenovector encoding HCV polyprotein.

Several difficulties were encountered in producing an adenovector encoding HCV polyprotein with codons optimized for expression in a human host. An adenovector containing an optimized sequence (SEQ. ID. NO. 3) was found to be more difficult to synthesize and rescue than an adenovector containing a non-optimized sequence (SEQ. ID. NO. 2).

The difficulties in producing an adenovector containing SEQ. ID. NO. 3 were attributed to a high GC content. A particularly problematic region was the region at about position 3900 of NSOPTmut (SEQ. ID. NO. 3).

Alternative versions of optimized HCV encoding nucleic acid sequence were designed to facilitate its use in an adenovector. The alternative versions, compared to NSOPTmut, were designed to have a lower overall GC content, to reduce/avoid the presence of potentially problematic motifs of consecutive G's or C's, while maintaining a high level of codon optimization to allow improved expression of the encoded polyprotein and the individual cleavage products.

A starting point for the generation of a suboptimally codon-optimized sequence is the coding region of the NSOPTmut nucleotide sequence (bases 7 to 5961 of SEQ. ID. NO. 3). Values for codon usage frequencies (normalized to a total of 1.0 for each amino acid) were taken from the file human_high.cod available in the Wisconsin Package Version 10.3 (Accelrys Inc., a wholly owned subsidiary of Pharmacopeia, Inc).

To reduce the local and overall GC content a table defining preferred codon substitutions for each amino acid was manually generated. For each amino acid the codon having 1) a lower GC content as compared to the most frequent codon and 2) a relatively high observed codon usage frequency (as defined in human_high.cod) was chosen as the replacement codon. For example, for Arg the codon with the highest frequency is CGC. Out of the other five alternative codons encoding Arg (CGG, AGG, AGA, CGT, CGA) three (AGG, CGT, CGA) reduce the GC content by 1 base, one (AGA) by two bases and one (CGG) by 0 bases. Since the AGA codon is listed in human_high.cod as having a relatively low usage frequency (0.1), the codon substituting CGC was therefore chosen to be AGG with a relative frequency of 0.18. Similar criteria were applied in order to establish codon replacements for the other amino acids resulting in the list shown in Table 5. Parameters applied in the following optimization procedure were determined empirically such that the resulting sequence maintained a considerably improved codon usage (for each amino acid) and the GC content (overall and in form of local stretches of consecutive G's and/or C's) was decreased.
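
The Arg example can be checked with a few lines of code. This is a sketch only: the helper names are hypothetical, and the frequency table holds just the two values quoted in the text (AGG 0.18, AGA 0.1), because the human_high.cod values for the other codons are not reproduced here.

```python
def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


MOST_FREQUENT_ARG = "CGC"
ALTERNATIVE_ARG = ["CGG", "AGG", "AGA", "CGT", "CGA"]

# How many G/C bases each alternative saves relative to CGC.
gc_reduction = {
    codon: gc_count(MOST_FREQUENT_ARG) - gc_count(codon)
    for codon in ALTERNATIVE_ARG
}

# Relative usage frequencies quoted in the text; other codons are omitted
# because their values are not given there.
QUOTED_FREQUENCY = {"AGG": 0.18, "AGA": 0.10}


def pick_replacement(reduction, frequency):
    """Among codons that lower GC content, pick the most frequently used."""
    candidates = [c for c in frequency if reduction.get(c, 0) > 0]
    return max(candidates, key=lambda c: frequency[c])


print(gc_reduction)
print(pick_replacement(gc_reduction, QUOTED_FREQUENCY))
```

The sketch reflects the trade-off described above: AGA saves more G/C bases than AGG, but its lower usage frequency makes AGG the preferred substitute.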

Two examples of partially optimized HCV encoding sequences are provided by SEQ. ID. NO. 10 and SEQ. ID. NO. 11. SEQ. ID. NO. 10 provides an HCV encoding sequence that is partially optimized throughout. SEQ. ID. NO. 11 provides an HCV encoding sequence fully optimized for codon usage with the exception of a region that was partially optimized.

Codon optimization was performed using the following procedure:

Step 1) The coding region of the input fully optimized NSOPTmut sequence was analyzed using a sliding window of 3 codons (9 bases) shifting the window by one codon after each cycle. Whenever a stretch containing 5 or more consecutive C's and/or G's was detected in the window the following replacement rule was applied: Let N indicate the number of codon replacements previously performed. If N is odd replace the middle codon in the window with the codon specified in Table 5, if N is even replace the third terminal codon in the window with the codon specified in a codon optimization table such as human_high.cod. If Leu or Val is present at the second or third codon do not apply any replacement in order not to introduce Leu or Val codons with very low relative codon usage frequency (see, for example, human_high.cod). In the following cycle analysis of the shifted window was then applied to a sequence containing the replacements of the previous cycle.

The alternating replacement of the middle and terminal codon in the three-codon window was found empirically to give a more satisfying overall maintenance of optimized codon usage while also reducing GC content (as judged from the final sequence after the procedure). In general, however, the precise replacement strategy depends on the amino acid sequence encoded by the nucleotide sequence under analysis and will have to be determined empirically.
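
Step 1 can be sketched as a single pass over the codons. The sketch makes several assumptions that should be read as such: Table 5 is not reproduced in this section, so the substitution table is passed in as a parameter (only CGC to AGG is taken from the text); the same table is used for both the middle and the third codon, which simplifies the rule as worded; and a replacement is counted only when the codon actually changes.

```python
import re

# Standard genetic code: codons for Leu and Val, which step 1 leaves untouched.
LEU_VAL_CODONS = {
    "CTT", "CTC", "CTA", "CTG", "TTA", "TTG",  # Leu
    "GTT", "GTC", "GTA", "GTG",                # Val
}


def reduce_gc_runs(seq, table, run_length=5):
    """Break up runs of G/C using a sliding three-codon window.

    `table` maps a codon to its lower-GC synonym (hypothetical stand-in
    for Table 5). Returns the edited sequence.
    """
    run = re.compile(r"[GC]{%d,}" % run_length)
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    n_replaced = 0
    for start in range(len(codons) - 2):
        # The window always sees replacements made in earlier cycles.
        window = codons[start:start + 3]
        if not run.search("".join(window)):
            continue
        if window[1] in LEU_VAL_CODONS or window[2] in LEU_VAL_CODONS:
            continue
        # Odd count: middle codon; even count: third codon.
        position = start + (1 if n_replaced % 2 == 1 else 2)
        new_codon = table.get(codons[position], codons[position])
        if new_codon != codons[position]:
            codons[position] = new_codon
            n_replaced += 1
    return "".join(codons)


# Five consecutive Arg codons; CGC -> AGG is the substitution given in the text.
print(reduce_gc_runs("CGC" * 5, {"CGC": "AGG"}))
```

Because each substitution swaps a codon for a synonym, the encoded amino acid sequence is unchanged as long as the table contains only synonymous pairs.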

Step 2) The sequence containing all the codon replacements performed 
during step 1) was then subjected to an additional analysis using a sliding witidow of ? 
21 codofis (63 bases) in length: acmrdirig to an adjustable parameter the overall GC 

15 content in the window was determined. If the GC content iii the window was higher 
than 70% the following codon replacemetit 

replace the codoris for the aniino acids Asri ^ Asp^ Cys ^ Glu, His, He, Lys, Phe, Tyr by . 
the codons given in Table 5. Restriction of the replacement to this set of arqirio acids ; 
was mdtivated by the fact that a) the replacement codori still has an aqcetably high 

20 frequency of usage in humari^high.cod arid b) the average dverafll human codon usaige 
in CUTG for the replacement codori is nearly as high as the riiost frequent codon. Iri 
the following cycle arialysis of the shifted window is then applied to a sequence 
contairiifig the replaceriients of the previous cycle. 

The threshold 70% was determined empirically by compromising
between an overall reduction in GC content and maintenance of a high codon
optimization for the individual amino acids. As in step 1) the precise replacement
strategy (choice of amino acids and GC content threshold value) will again depend on
the amino acid sequence encoded by the nucleotide sequence under analysis and will
have to be determined empirically.

Step 3) The sequence generated by steps 1) and 2) was then manually
edited and additional codons were changed according to the following criteria:
Regions still having a GC content higher than 70% over a window of 21 codons were
examined manually and a few codons were replaced again following the scheme given
in Table 5.

Subsequent steps were performed to provide for useful restriction sites,
to remove possible open reading frames on the complementary strand, to add
homologous recombinant regions, to add a Kozak signal, and to add a terminator.
These steps are numbered 4-7.

Step 4) The sequence generated in step 3 was examined for the absence
of certain restriction sites (BglII, PmeI and XbaI) and presence of only 1 StuI site to
allow a subsequent cloning strategy using a subset of restriction enzymes. Two sites
(one for BglII and one for StuI) were removed from the sequence by replacing codons
that were part of the respective recognition sites.
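
The restriction-site check of step 4 can be sketched in Python as follows. The recognition sequences are the standard ones for these enzymes (all four are palindromic, so scanning one strand is enough); the function names `count_sites` and `passes_step4` are hypothetical and only illustrate the check, not the tool actually used.

```python
# Standard recognition sequences of the enzymes named in step 4.
SITES = {
    "BglII": "AGATCT",
    "PmeI": "GTTTAAAC",
    "XbaI": "TCTAGA",
    "StuI": "AGGCCT",
}


def count_sites(seq):
    """Count (possibly overlapping) occurrences of each recognition site."""
    seq = seq.upper()
    return {
        name: sum(seq.startswith(site, i) for i in range(len(seq)))
        for name, site in SITES.items()
    }


def passes_step4(seq):
    """True if BglII, PmeI and XbaI are absent and exactly one StuI site is present."""
    counts = count_sites(seq)
    return (
        counts["BglII"] == 0
        and counts["PmeI"] == 0
        and counts["XbaI"] == 0
        and counts["StuI"] == 1
    )
```

A sequence that fails the check would then be edited by swapping a synonymous codon inside the offending site, as the text describes.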

Step 5) The sequence generated by steps 1) through 4) was then
modified to allow subsequent generation of a modified NSOPTmut
sequence (by homologous recombination). In the sequence obtained from steps 1)
through 4) the segment comprising bases 3556 to 3755 and the segment comprising
bases 4456 to 4656 were replaced by the corresponding segments from NSOPTmut.
The segment comprising bases 3556 to 4656 of SEQ. ID. NO. 10 can be used to
replace the problematic region in NSOPTmut (around position 3900) by homologous
recombination, thus creating the variant of NSOPTmut having the sequence of SEQ.
ID. NO. 11.

Step 6) Analysis of the sequence generated through steps 1) to 5)
revealed a potential open reading frame spanning nearly the complete fragment on the
complementary strand. Removal of all codons CTA and TTA (Leu) and TCA (Ser)
from the sense strand effectively removed all stop codons in one of the reading frames
on the complementary strand. Although the likelihood for transcription of this
complementary strand open reading frame and subsequent translation into protein is
very small, in order to exclude a potential interference with the transcription and
subsequent translation of the sequence encoded on the sense strand, TCA codons for
Ser were introduced on the sense strand approximately every 500 bases. No changes
were introduced in the segments introduced during step 5) to allow homologous
recombination. The TCA codon for Ser was preferred over the CTA and TTA codons
for Leu because of the higher relative frequency for TCA (0.05) as compared to CTA
(0.02) and TTA (0.03) in human_high.cod. In addition, the average human codon
usage from CUTG favored TCA (0.14 against 0.07 for CTA and TTA).
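
The reasoning of step 6 follows from reverse complementation: the sense codons CTA, TTA and TCA read as TAG, TAA and TGA on the complementary strand. A small Python sketch makes this explicit; the helper names `reverse_complement` and `antisense_stop_codons` are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_stop_codons(sense):
    """List (codon index, sense codon) pairs that read as a stop codon on the
    complementary strand, in the frame aligned with the sense codons."""
    sense = sense.upper()
    codons = [sense[i:i + 3] for i in range(0, len(sense) - 2, 3)]
    return [
        (index, codon)
        for index, codon in enumerate(codons)
        if reverse_complement(codon) in STOP_CODONS
    ]
```

An empty result over a long stretch indicates an open reading frame on the complementary strand in that frame; reintroducing a TCA codon at intervals restores stop codons there without changing the encoded Ser.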

Step 7) In a final step GCCACC was added at the 5' end of the
sequence to generate an optimized internal ribosome entry site (Kozak signal) and a
TAAA stop signal was added at the 3' end. To maintain the initiation of translation
properties of NSsuboptmut the first 8 codons of the coding region were kept identical
to the NSOPTmut sequence. The resulting sequence was again checked for the
absence of BglII, PmeI and XbaI recognition sites and the presence of only 1 StuI site.

The NSsuboptmut sequence (SEQ. ID. NO. 10) has an overall reduced
GC content (63.5%) as compared to NSOPTmut (70.3%) and maintains a well
optimized level of codon usage. Nucleotide sequence identity of
NSsuboptmut is 77.2% with respect to NSmut.

Table 5: Definition of codon replacements performed during steps 1) and 2).

Amino   Most frequent   Relative    Reduction in GC    Replacement   Relative
Acid    codon           frequency   content (bases)    codon         frequency

Amino acids where the replacement codon reduces the codon GC content by 1 base
Ala     GCC             0.51        1                  GCT           0.17
Arg     CGC             0.37        1                  AGG           0.18
Asn     AAC             0.78        1                  AAT           0.22
Asp     GAC             0.75        1                  GAT           0.25
Cys     TGC             0.68        1                  TGT           0.32
Glu     GAG             0.75        1                  GAA           0.25
Gln     CAG             0.88        1                  CAA           0.12
Gly     GGC             0.50        1                  GGA           0.14
His     CAC             0.79        1                  CAT           0.21
Ile     ATC             0.77        1                  ATT           0.18
Lys     AAG             0.82        1                  AAA           0.18
Phe     TTC             0.80        1                  TTT           0.20
Pro     CCC             0.48        1                  CCT           0.19
Ser     AGC             0.34        1                  TCT           0.13
Thr     ACC             0.51        1                  ACA           0.14
Tyr     TAC             0.74        1                  TAT           0.26

Amino acids with no alternative codon
Met     ATG             1.00        0                  ATG           1.00
Trp     TGG             1.00        0                  TGG           1.00

Amino acids where the replacement codon has a very low relative frequency. These
amino acids were excluded from the replacement procedure
Leu     CTG             0.58        1                  TTG           0.06
Val     GTG             0.64        1                  GTT           0.07


Example 12: Virus Characterization 

Adenovectors were characterized by: (a) measuring the physical
particles/ml; (b) running a TaqMan PCR assay; and (c) checking protein expression
after infection of HeLa cells.






a) Physical Particles Determination

CsCl purified virus was diluted 1/10 and 1/100 in 0.1% SDS PBS. As
a control, buffer A105 was used. These dilutions were incubated 10 minutes at 55°C.
After spinning the tubes briefly, O.D. at 260 nm was measured. The amount of viral
particles was calculated as follows: 1 OD 260 nm = 1.1 X 10^12 physical particles/ml.
The results were typically between 5 X lO" and 1 X 10^^ physical particles /ml. 
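
The particle calculation can be written out as a short Python sketch. It assumes the commonly used conversion of 1.1 x 10^12 particles/ml per OD260 unit and assumes that the reading of the buffer control is subtracted as a blank; the function name and its parameters are illustrative only.

```python
PARTICLES_PER_OD260 = 1.1e12  # assumed conversion: particles/ml per OD260 unit


def physical_particles_per_ml(od260, dilution_factor, blank_od260=0.0):
    """Estimate physical particles/ml of the undiluted virus preparation.

    od260           -- absorbance of the diluted, SDS-treated sample
    dilution_factor -- 10 for a 1/10 dilution, 100 for a 1/100 dilution
    blank_od260     -- absorbance of the buffer control (assumed to be a blank)
    """
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return (od260 - blank_od260) * dilution_factor * PARTICLES_PER_OD260


# Example with made-up readings: OD260 of 0.05 at a 1/10 dilution.
estimate = physical_particles_per_ml(0.05, 10)
print(f"{estimate:.2e} physical particles/ml")
```

Measuring both the 1/10 and the 1/100 dilution gives two estimates that should agree if the readings fall in the linear range of the spectrophotometer.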

b) TaqMan PCR Assay

TaqMan PCR assay was used for adenovector genome quantification
(Q-PCR particles/ml). TaqMan PCR assay was performed using the ABI Prism 7700
sequence detector. The reaction was performed in a final 50 µl volume in the
presence of oligonucleotides (at final 200 nM) and probe (at final 200 µM) specific
for the adenoviral backbone. The virus was diluted 1/10 in 0.1% SDS PBS and
incubated 10 minutes at 55°C. After spinning the tube briefly, serial 1/10 dilutions (in
water) were prepared. 10 µl of the 10'\ 10"^ and 10'^ dilutions were used as templates in
the PCR assay.

The amount of particles present in each sample was calculated on the
basis of a standard curve run in the same experiment. Typically results were between
1 X 10*^ and 3 X 10^^ Q-PCR particles/ml.



c) Expression of HCV Non-Structural Proteins

Expression of HCV NS proteins was tested by infection of HeLa cells.
Cells were plated the day before the infection at 1.5 X 10^ cells/dish (10 cm Ø Petri
dishes). Different amounts of CsCl purified virus corresponding to m.o.i. of 50, 250
and 1250 pp/cell were diluted in medium (FCS free) up to a final volume of 5 ml. The
diluted virus was added on the cells and incubated for 1 hour at 37°C in a CO2
incubator (gently mixing every 20 minutes). 5 ml of 5% HS-DMEM was added and
the cells were incubated at 37°C for 48 hours.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts
were run on 10% SDS-acrylamide gel, blotted on nitrocellulose and assayed with
antibodies directed against NS3, NS5a and NS5b in order to check the correct
polyprotein cleavage. Mock-infected cells were used as a negative control. Results
from representative experiments testing the Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut
and MRKAd6-NSOPTmut are shown in Figure 14.

Example 13: Mice Immunization with Adenovectors Encoding Different NS
Cassettes

The adenovectors Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut and
MRKAd6-NSOPTmut were injected in C57Black6 mice to evaluate their
potential to elicit anti-HCV immune responses. Groups of animals (N=9-10) were
injected intramuscularly with 10^9 pp of CsCl purified virus. Each animal received two
doses at a three-week interval.

Humoral immune response against the NS3 protein was measured in
post dose two sera from C57Black6 immunized mice by ELISA on bacterially
expressed NS3 protease domain. Antibodies specific for the tested antigen were
detected with geometric mean titers (GMT) ranging from 100 to 46000 (Tables 6, 7, 8
and 9).

Table 6: Ad5-NS

Mice n.   1      2      3      4      5      6      7      8      9      10     GMT
Titer     50     253    50     50     50     2257   504    50     50     50     108

Table 7: Ad5-NSmut

Mice n.   11     12     13     14     15     16     17     18     19     20     GMT
Titer     3162   78850  87241  6796   12134  3340   18473  13093  76167  49593  23645



Table 8: MRKAd6-NSmut

Mice n.   21      22     23     24     25     26     27     28     29     30     GMT
Titer     125626  39751  40187  65834  60619  69933  21555  49348  29290  26859  46461



Table 9: MRKAd6-NSOPTmut

Mice n.   31     32     33     34     35     36     37     GMT
Titer     25430  3657   893    175    10442  49540  173    2785
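
The GMT column of these tables is the geometric mean of the individual titers. A minimal Python sketch of that calculation is given here, using the titers listed for Table 6 and Table 9; the function name `geometric_mean_titer` is illustrative.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of a list of positive titers (mean taken on the log scale)."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


table6_titers = [50, 253, 50, 50, 50, 2257, 504, 50, 50, 50]
table9_titers = [25430, 3657, 893, 175, 10442, 49540, 173]

print(round(geometric_mean_titer(table6_titers)))  # compare with the GMT in Table 6
print(round(geometric_mean_titer(table9_titers)))  # compare with the GMT in Table 9
```

Averaging on the log scale keeps a few very high responders from dominating the summary, which is why titers are usually reported as a geometric rather than arithmetic mean.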



T cell response in C57Black6 mice was analyzed by the quantitative
ELISPOT assay measuring the number of IFNγ secreting T cells in response to five
pools (named from F to L+M) of 20mer peptides overlapping by ten residues
encompassing the NS3-NS5B sequence. Specific CD8+ response induced in
C57Black6 mice was analyzed by the same assay using a 20mer peptide
encompassing a CD8+ epitope for C57Black6 mice (pep1480). Cells secreting IFNγ
in an antigen-specific manner were detected using a standard ELISPOT assay.

Spleen cells, splenocytes and peptides were produced and treated as
described in Example 3, supra. Representative data from groups of C57Black6 mice
(N=9-10) immunized with two injections of 10^9 viral particles of vectors Ad5-NS,
MRKAd5-NSmut and MRKAd6-NSmut are shown in Figure 15.

Example 14: Immunization of Rhesus macaques with Adenovectors

Rhesus macaques (N=3-4) were immunized by intramuscular injection
of CsCl purified Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut or MRKAd6-NSOPTmut
virus. Each animal received two doses of 10^^ or 10^10 vp in the deltoid
muscle at 0 and 4 weeks.

CMI was measured at different time points by a) IFN-γ ELISPOT (see
Example 3, supra), b) IFN-γ ICS and c) bulk CTL assays. These assays measure HCV
antigen-specific CD8+ and CD4+ T lymphocyte responses, and can be used for a
variety of mammals, such as humans, rhesus monkeys, mice, and rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5a, NS5b) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins while
smaller pools and individual peptides may be used to define the epitope specificity of
a response.
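
The peptide design (20-mers offset by 10 residues, so that neighbours share 10 residues) can be sketched in Python as follows. The function name is hypothetical, and the handling of a leftover C-terminal stretch (one extra peptide covering the last 20 residues) is an assumption, since the text does not say how the end of each protein was treated.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Split a protein sequence into peptides of `length` residues whose start
    positions are `offset` residues apart."""
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if len(protein) <= length:
        return [protein]
    last_start = len(protein) - length
    peptides = [protein[i:i + length] for i in range(0, last_start + 1, offset)]
    if last_start % offset:
        # Assumption: cover the C-terminal remainder with one final peptide.
        peptides.append(protein[-length:])
    return peptides


# Illustrative 50-residue sequence (not an HCV sequence).
demo = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEFGHIKL"
for peptide in overlapping_peptides(demo):
    print(peptide)
```

Pools are then simply consecutive groups of these peptides; a response to a pool can be narrowed down by testing its individual members.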

IFN-γ ICS

For IFN-γ ICS, 2 x 10^6 PBMC in 1 ml R10 (RPMI medium,
supplemented with 10% FCS) were stimulated with peptide pool antigens. Final
concentration of each peptide was 2 µg/ml. Cells were incubated for 1 hour in a CO2
incubator at 37°C and then Brefeldin A was added to a final concentration of 10 µg/ml
to inhibit the secretion of soluble cytokines. Cells were incubated for an additional
14-16 hours at 37°C.

Stimulation was done in the presence of co-stimulatory antibodies:
CD28 and CD49d (anti-humanCD28 BD340975 and anti-humanCD49d BD340976).
After incubation, cells were stained with fluorochrome-conjugated antibodies for
surface antigens: anti-CD3, anti-CD4, anti-CD8 (CD3-APC Biosource APS0301,
CD4-PE BD345769, CD8-PerCP BD345774).

To detect intracellular cytokines, cells were treated with FACS
permeabilization buffer 2 (BD340973), 2x final concentration. Once fixed and
permeabilized, cells were incubated with an antibody against human IFN-γ, IFN-γ
FITC (Biosource AHC4338).

Cells were resuspended in 1% formaldehyde in PBS and analyzed at
FACS within 24 hours. Four color FACS analysis was performed on a FACSCalibur
instrument (Becton Dickinson) equipped with two lasers. Acquisition was done
gating on the lymphocyte population in the Forward versus Side Scatter plot coupled
with the CD3, CD8 positive populations. At least 30,000 events of the gate were
taken. The positive cells are expressed as number of IFN-γ expressing cells over 10^
lymphocytes.

IFN-γ ELISPOT and IFN-γ ICS data from immunized monkeys after
one or two injections of 10^^ or 10* ^ vp of the different adenovectors are reported in
Figures 16A-16D, 17A, and 17B.

Bulk CTL Assays

A distinguishing effector function of T lymphocytes is the ability of
subsets of this cell population to directly lyse cells exhibiting appropriate MHC-associated
antigenic peptides. This cytotoxic activity is most often associated with
CD8+ T lymphocytes.

PBMC samples were infected with recombinant vaccine viruses
expressing HCV antigens in vitro for approximately 14 days to provide antigen
restimulation and expansion of memory T cells. Cytotoxicity against autologous B
cell lines treated with peptide antigen pools was tested.

The lytic function of the culture is measured as a percentage of specific
lysis resulting from chromium released from target cells during 4 hours incubation
with CTL effector cells. Specific cytotoxicity is measured and compared to irrelevant
antigen or excipient-treated B cell lines. This assay is semi-quantitative and is the
preferred means for determining whether CTL responses were elicited by the vaccine.
Data after two injections from monkeys immunized with 10^^ vp/dose with
adenovectors Ad5-NS, MRKAd5-NSmut and MRKAd6-NSmut are reported in
Figures 18A-18F.
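
The text does not spell out how the percentage of specific lysis is computed. A minimal Python sketch, assuming the conventional chromium-release formula (experimental release corrected for spontaneous release and normalized to maximum release), is shown here; the function and parameter names are illustrative.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Conventional chromium-release calculation (assumed, not quoted from the text).

    experimental -- counts released from targets incubated with effector cells
    spontaneous  -- counts released from targets incubated in medium alone
    maximum      -- counts released from fully lysed targets (e.g. detergent)
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Made-up counts for illustration only.
print(percent_specific_lysis(experimental=1500, spontaneous=500, maximum=2500))
```

The value obtained with peptide-treated targets is then compared with the value for targets treated with an irrelevant antigen or with excipient, as described above.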

Other embodiments are within the following claims. While several
embodiments have been shown and described, various modifications may be made
without departing from the spirit and scope of the present invention.

WHAT IS CLAIMED IS:

1. A nucleic acid comprising a nucleotide sequence encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ ID
NO: 1, provided that said polypeptide has sufficient protease activity to process itself
to produce an NS5B protein and said NS5B protein is enzymatically inactive.

2. The nucleic acid of claim 1, wherein said nucleotide sequence
is substantially similar to the coding sequence of SEQ ID NO: 2.

3. The nucleic acid of claim 1, wherein said nucleotide sequence
encodes for the polypeptide of SEQ ID NO: 1.

4. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

5. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2 or SEQ ID NO: 3.

6. The nucleic acid of any one of claims 1-5, wherein said nucleic
acid is an expression vector capable of expressing said polypeptide from said
nucleotide sequence in a human cell.

7. A nucleic acid comprising a gene expression cassette able to
express a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to
SEQ ID NO: 1 in a human cell, provided that said polypeptide can process itself to
produce an NS5B protein and said NS5B protein is enzymatically inactive, said
expression cassette comprising:

a) a promoter transcriptionally coupled to a nucleotide sequence
encoding said polypeptide;

b) a 5' ribosome binding site functionally coupled to said nucleotide
sequence,

c) a terminator joined to the 3' end of said nucleotide sequence, and

d) a 3' polyadenylation signal functionally coupled to said nucleotide
sequence.

8. The nucleic acid of claim 7, wherein said nucleotide sequence
is substantially similar to either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

9. The nucleic acid of claim 8, wherein said nucleic acid is a
shuttle vector further comprising a selectable marker, an origin of replication, a first
adenovirus homology region and a second adenovirus homology region flanking said
expression cassette, wherein said first homology region has at least about 100 base
pairs substantially homologous to at least the right end of a wild-type adenovirus region
from about base pairs 1-425, and said second homology region has at least about 100
base pairs substantially homologous to at least the left end of a wild-type adenovirus
region from about base pairs 3511-5792 of Ad5 or corresponding region of another
adenovirus.

10. The nucleic acid of claim 9, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

11. The nucleic acid of claim 9, wherein said nucleotide sequence
is SEQ ID NO: 2.

12. The nucleic acid of claim 9, wherein said nucleotide sequence
is either SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

13. The nucleic acid of claim 8, wherein said nucleic acid is a
plasmid suitable for administration into a human and further comprises a prokaryotic
origin of replication and a gene coding for a selectable marker.

14. The nucleic acid of claim 13, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

15. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

16. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of SEQ ID NO: 2 or SEQ ID NO: 3.

17. The nucleic acid of claim 14, wherein said promoter is the
human intermediate early cytomegalovirus promoter (intron A), said 5' ribosome
binding site consists of SEQ ID NO: 12, and said 3' polyadenylation is the bovine
growth hormone (BGH) polyadenylation signal.

18. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and a recombinant adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

19. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

20. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

21. The nucleic acid of claim 20, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

22. The nucleic acid of claim 21, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

23. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

24. The nucleic acid of claim 23, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

25. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

26. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2 or SEQ ID NO: 3.

27. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising an origin of replication, a selectable marker,
and:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

28. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

29. The nucleic acid of claim 28, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

30. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

31. The nucleic acid of claim 30, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

32. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of a nucleotide sequence substantially similar to SEQ ID
NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

33. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector having an adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

34. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.






WQ 03/031588 



PCT/US02/32512 



35. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

36. The nucleic acid of claim 35, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

37. The nucleic acid of claim 36, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

38. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

39. The nucleic acid of claim 37, wherein said promoter is the human
intermediate early cytomegalovirus promoter, said 5' ribosome binding site consists
of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation signal.

40. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2,
SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

41. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2 or
SEQ ID NO: 3.

42. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

43. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

44. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

45. An adenovector consisting of the nucleic acid sequence of SEQ
ID NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

46. An adenovector produced by a process comprising the steps of:

a) producing an adenovirus genome plasmid by homologous
recombination between the shuttle vector of claim 9 and a nucleic acid comprising:

a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region; and

b) rescuing said adenovector from said adenovirus plasmid.

47. A cultured recombinant cell comprising the nucleic acid of
claim 6.

48. A cultured recombinant cell comprising the nucleic acid of any
one of claims 9-46.

25 49i A method of making an adenovector comprising the steps of: 

a) producing an adendvims genome plasmid comprising a gene , 
expression cassette by homologous recombination between the nucleic acid of cljaiiri'9 
and a nucleic acid comprising; 

a first adenovirus region from about base pair 1 to about base 
30 pair 450 corresponding to either Ad5 or Ad6; 

a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region; and 
b) rescuing said recombinant adenovirus from said recombinant
adenovirus plasmid.

50. A pharmaceutical composition comprising the nucleic acid of
any one of claims 13-17 and 32-46 and a pharmaceutically acceptable carrier.


51. A method of treating a patient comprising the step of 
administering to said patient an effective amount of the nucleic acid of any one of
claims 13-17 and 32-46. 

52. The method of claim 51, wherein said patient is a human.

53. The method of claim 52, wherein said patient is not infected
with HCV.

54. The method of claim 52, wherein said patient is infected with
HCV.

55. A recombinant nucleic acid comprising one or more Ad6 
regions and a region not present in Ad6, wherein at least one Ad6 region is selected
from the group consisting of: E1A, E1B, E2B, E2A, E4, L1, L2, L4, and L5.

56. The recombinant nucleic acid of claim 55, wherein said region 
not present in Ad6, is an expression cassette coding for a polypeptide not found in 
Ad6. 

57. The recombinant nucleic acid of claim 56, wherein said
recombinant nucleic acid is an adenovirus vector defective in at least E1 that is able to
replicate when E1 is supplied in trans.

58. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said gene expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) an optionally present fourth region from about base pair 28134 
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to 
about 30789 corresponding to Ad6, joined to said third region;

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein said fifth region is joined to said fourth
region if said fourth region is present, or said fifth is joined to said third region if said
fourth region is not present; and

g) a sixth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region;

provided that at least one of said second, third, and fifth regions is 

from Ad6. 






59. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region;

provided that at least one of said second, third, and fourth regions is 

from Ad6.

1 MAPITAYSQQ TRGLLGCIIT SLTGRDKNQV EGEVQVVSTA TQSFLATCVN
51 GVCWTVYHGA GSKTLAGPKG PITQMYTNVD QDLVGWQAPP GARSLTPCTC
101 GSSDLYLVTR HADVIPVRRR GDSRGSLLSP RPVSYLKGSS GGPLLCPSGH
151 AVGIFRAAVC TRGVAKAVDF VPVESMETTM RSPVFTDNSS PPAVPQSFQV
201 AHLHAPTGSG KSTKVPAAYA AQGYKVLVLN PSVAATLGFG AYMSKAHGID
251 PNIRTGVRTI TTGAPVTYST YGKFLADGGC SGGAYDIIIC DECHSTDSTT
301 ILGIGTVLDQ AETAGARLVV LATATPPGSV TVPHPNIEEV ALSNTGEIPF
351 YGKAIPIEAI RGGRHLIFCH SKKKCDELAA KLSGLGINAV AYYRGLDVSV
401 IPTIGDVVVV ATDALMTGYT GDFDSVIDCN TCVTQTVDFS LDPTFTIETT
451 TVPQDAVSRS QRRGRTGRGR RGIYRFVTPG ERPSGMFDSS VLCECYDAGC
501 AWYELTPAET SVRLRAYLNT PGLPVCQDHL EFWESVFTGL THIDAHFLSQ
551 TKQAGDNFPY LVAYQATVCA RAQAPPPSWD QMWKCLIRLK PTLHGPTPLL
601 YRLGAVQNEV TLTHPITKYI MACMSADLEV VTSTWVLVGG VLAALAAYCL
651 TTGSVVIVGR IILSGRPAIV PDREFLYQEF DEMEECASHL PYIEQGMQLA
701 EQFKQKALGL LQTATKQAEA AAPVVESKWR ALETFWAKHM WNFISGIQYL
751 AGLSTLPGNP AIASLMAFTA SITSPLTTQS TLLFNILGGW VAAQLAPPSA
801 ASAFVGAGIA GAAVGSIGLG KVLVDILAGY GAGVAGALVA FKVMSGEMPS
851 TEDLVNLLPA ILSPGALVVG VVCAAILRRH VGPGEGAVQW MNRLIAFASR
901 GNHVSPTHYV PESDAAARVT QILSSLTITQ LLKRLHQWIN EDCSTPCSGS
951 WLRDVWDWIC TVLTDFKTWL QSKLLPQLPG VPFFSCQRGY KGVWRGDGIM
1001 QTTCPCGAQI TGHVKNGSMR IVGPKTCSNT WHGTFPINAY TTGPCTPSPA
1051 PNYSRALWRV AAEEYVEVTR VGDFHYVTGM TTDNVKCPCQ VPAPEFFTEV
1101 DGVRLHRYAP ACRPLLREEV TFQVGLNQYL VGSQLPCEPE PDVAVLTSML
1151 TDPSHITAET AKRRLARGSP PSLASSSASQ LSAPSLKATC TTHHVSPDAD
1201 LIEANLLWRQ EMGGNITRVE SENKVVVLDS FDPLRAEEDE REVSVPAEIL
1251 RKSKKFPAAM PIWARPDYNP PLLESWKDPP YVPPVVHGCP LPPIKAPPIP
1301 PPRRKRTVVL TESSVSSALA ELATKTFGSS ESSAVDSGTA TALPDQASDD
1351 GDKGSDVESY SSMPPLEGEP GDPDLSDGSW STVSEEASED VVCCSMSYTW
1401 TGALITPCAA EESKLPINAL SNSLLRHHNM VYATTSRSAG LRQKKVTFDR
1451 LQVLDDHYRD VLKEMKAKAS TVKAKLLSVE EACKLTPPHS AKSKFGYGAK
1501 DVRNLSSKAV NHIHSVWKDL LEDTVTPIDT TIMAKNEVFC VQPEKGGRKP
1551 ARLIVFPDLG VRVCEKMALY DVVSTLPQVV MGSSYGFQYS PGQRVEFLVN
1601 TWKSKKNPMG FSYDTRCFDS TVTENDIRVE ESIYQCCDLA PEARQAIKSL
1651 TERLYIGGPL TNSKGQNCGY RRCRASGVLT TSCGNTLTCY LKASAACRAA
1701 KLQDCTMLVN AAGLVVICES AGTQEDAASL RVFTEAMTRY SAPPGDPPQP
1751 EYDLELITSC SSNVSVAHDA SGKRVYYLTR DPTTPLARAA WETARHTPVN
1801 SWLGNIIMYA PTLWARMILM THFFSILLAQ EQLEKALDCQ IYGACYSIEP
1851 LDLPQIIERL HGLSAFSLHS YSPGEINRVA SGLRKLGVPP LRVWRHRARS
1901 VRARLLSQGG RAATCGKYLF NWAVKTKLKL TPIPAASQLD LSGWFVAGYS
1951 GGDIYHSLSR ARPRWFMLCL LLLSVGVGIY LLPNR

1 GCCACCATGG CGCCGATCAG (3GCCTAGTCC CAACAGACGC GGGGCCTACT

51 TGGTTGCATC ATCACTAGGC TTACAGGCCG GGACAAGAAG CAGGTCGAGG 

101 GAGAGGTTCA GGTGGTTTCC AeCGCAACAC AATCCTTCCT GGCGACCTGC 

151 GTCAACGGCG TGTGTTGGAC CiSTTTACCAT GGTGCTGGCT CAAAGACCTT 

201 AGCCGGCCCA AAGGGGCCAA TCACCCAGAT GTACACTAAT GTGGACCAGG 

251 ACCTCGTCGG CTGGCAGGCG CCCCCGGGGG CGCGTTGCTT GACACCAa?GC 

301 ACCTGTGGCA GCTCAGACCT TTACTTGGTC ACGAGACATG CTGACGTCAT 

351 TCCGGTGCGC CGGCGGGGCG ACAGTAGGGG GAGCCTGCTG TCCCCCAGGC 

401 CTGTCTCCTA CTTGAAGGGC TCTTCGGGTG GTCCACTGCT CTGCCCTTCG 

451 GGGCACGCTG TGGGCATCTT CCGGGCTGCC GTATGGACCG GGGGGGTTGC 

501 GAAGGGGGTG GAGTTTGTGC CCGTAGAGTG CATGGAAACT ACfATGCGGT

551 CTCGGGTGTT CACGGACAAC TCATCCCCCC CGGGCdTAGC GCAGTCATTT 

601 CAAGTGGCGO ACCTACACGG TCCCACTGGC AGCGGCAAGA GTACTAAAGT 

651 GCGGGCTGCA TATGCAQCCC AAGGiSTACAA GGTGCTCGTC CTCT^TCCGT 

701 GGGTTGCCGC TAGCTTAGGG TTTGGGGCGT ATATGTGTAA GGCACACGGT 

751 ATTGACCCCA ACATGAGAAC 1K3GGGTAAGG ACCATTACCA CAGGCGCCCC

801 dGtCACATAC TCTACCTATG GCAAGTTTCT TGGCGATGCST GK3TTGCTCTG 

851 GGGGCGGTTA TGACATCAtA ATATGTGATG AGTGCGATTC AACTGACTCG 

901 ACTACAATCT TGGGGATCGG CACAGTGCTG GACCAAGGIGG AGACGGCTGG 

951 AGGGGGiGCTT GTGGTGCTGG CCACCGGTAC GCGTGGGGGA TGGGTGACGG 

1001 TGGCAGAGCC AAAGATGGAG GAGGTGGGGG TGTCTAATAG TGGAGAGATG 

1051 CGGtTGTATG GGAAAGCGAT GGCGATTGAA GCCATCAGGG GGGGAAGGCA 

1101 TCTCATTTTC TGTCATTCCA AGAAGAAGTG CGACGAGCTC GCCGCAAAGC

1151 TGTCAGGCCT CGGAATCAAC GCTGTGGCGT ATTACCGGGG GCTCGATGTG

1201 TCCGTCATAG GAACTATCGG AGACGTCGTT GTCGTGGCAA CAGACGCTCT 

1251 GATGACGGGC TATACGGGCG ACTTTGACtC AGTGATCGAC TGTAACACAT 

1301 GTGTCACCCA GACAGTCGAC TTCAGCTTGG ATGCCACCTT CACCATTGAG

1351 ACCACGACCG TGCCTCAAGA CGCAGTGTCG CGCTCGCAGC GGCGGGGTAG 

1401 GACTGGCAGG GGTAGGAGAG GGATGTAGAG GTTTGTGAGT CCGGGAGAAC

1451 GGGGCTCGGG GATGTTGGAT TGGtCGGTCG TGTGTGAGTG CTATGAGGCG 

1501 GGCTGTGGTT GGTAGGAGCT CACGGCGGCG GAGACCTCGG TTAGGTTGCG 

1551 GGGGTACCTG AAGACACCAG GGIT^GCGGGT TTGCCAGkSAC CAGCTGGAGT 

1601 TCTGGGAGAG TGTCTTCACA GGGCTCACCG AGATAGATGC ACACTTCTTG

1651 TCCCAGACCA AGCAGGCAGG AGACAACTTC CCCTACCTGG TAGCATACCA

1701 AGCCACGGTG TGCGCCAGGG CTCAGGCCCC ACCTCCATCA TGGGATCAAA

1751 TGTGGAAGTG TCTCATACGG CTGAAACCTA CGCTGCACGG GCCAACACCC

1801 TTGCTGTACA GGCTGGGAGC CGTCC^^AAAT GAGGTCACCC TCACCCACCC 

1851 CATAACCAAA TACATCATGG CATGCAT6TC GGCTGACCTG GAGGTCGTCA 

1901 CTAGCACCTG GGTGCTGGTG GGCGGAGTCC TTGCAGCTCT GGCCGCGTAT 

1951 TGCCTGACAA CAGGCAGTGT GGTCATTGTG GGTAGGATTA TCTTGTCCGG 

2001 GAGGCCGGCT ATTGTTCCCG ACAGGGAGTT TCTCTACCAG GAGTTCGATG 

2051 AAATGGAAGA GTGCGCCTGG CACCTCCCTT ACATCGAGCA GGGAATGCAG 

2101 CTCGCCGAGC AATTCAAGCA GAAAGCGCTC GGGTTACTGC AAACAGCCAC 

2151 CAAACAAGCG GAGGCTGCTG CTCCCGTGGT GGAGTCCAAG TGGCGAGCCC 

2201 TTGAGACATT CTGGGCGAAG CACATGTGGA ATTTCATCAG CGGGATACAG 

2251 TACTTAGCAG GCTTATCCAC TCTGCCTGGG AACCCCGCAA TAGCATCATT 

2301 GATGGCATTC ACAGCCTCTA TCACCAGCCC GCTCACCACC CAAAGTACCC 

2351 TCCTGTTTAA CATCTTGGGG GGGTGGGTGG CTGCCCAACT CGCCCCCCCC 

2401 AGCGCCGCTT CGGCTTTCGT GGGCGCCGGC ATCGCCGGTG CGGCTGTTGG 

2451 CAGCATAGGC CTTGGGAAGG TGCTTGTGGA CATTCTGGCG GGTTATGGAG 

2501 CAGGAGTGGC CGGCGCGCTC GTGGCCTTCA AGGTCATGAG CGGCGAGATG

2551 CCCTCCACCG AGGACCTGGT CAATCTACTT CCTGCCATCC TCTCTCCTGG 

2601 CGCCCTGGTC GTCGGGGTCG TGTGTGCAGC AATACTGCGT CGACACGTGG 

2651 GTCCGGGAGA GGGGGCTGTG CAGTGGATGA ACCGGCTGAT AGCGTTCGCC 

2701 TCGCGGGGTA ATCATGTTTC CCCCACGCAC TATGTGCCTG AGAGCGACGC 

2751 CGCAGCGCGT GTTACTCAGA TCCTCTCCAG CCTTACCATC ACTCAGCTGC 

2801 TGAAAAGGCT CCACCAGTGG ATTAATGAAG ACTGCTCCAC ACCGTGTTCC 

2851 GGCTCGTGGC TAAGGGATGT TTGGGACTGG ATATGCACGG TGTTGACTGA 

2901 CTTCAAGACC TGGCTCCAGT CCAAGCTCCT GCCGCAGCTA CCGGGAGTCC 

2951 CTTTTTTCTC GTGCCAACGC GGGTACAAGG GAGTCTGGCG GGGAGACGGC 

3001 ATCATGCAAA CCACCTGCCC ATGTGGAGCA CAGATCACCG GACATGTCAA 

3051 AAACGGTTCC ATGAGGATCG TCGGGCCTAA GACCTGCAGC AACACGTGGC

3101 ATGGAACATT CCCCATCAAC GCATACACCA CGGGCCCCTG CACACCCTCT 

3151 CCAGCGCCAA ACTATTCTAG GGCGCTGTGG CGGGTGGCCG CTGAGGAGTA 

3201 CGTGGAGGTC ACGCGGGTGG GGGATTTCCA CTACGTGACG GGCATGACCA 

3251 CTGACAACGT AAAGTGCCCA TGCCAGGTTC CGGCTCCTGA ATTCTTCACG 

3301 GAGGTGGACG GAGTGCGGTT GCACAGGTAC GCTCCGGCGT GCAGGCCTCT 

3351 CCTACGGGAG GAGGTTACAT TCCAGGTCGG GCTCAACCAA TACCTGGTTG

3401 GGTCACAGCT ACCATGCGAG CCCGAACGGG ATGTAGCAGT GCTCAGTTCe
3451 ATGCTCACCG ACCCCTCCCA CATCACAGCA GAAACGGCTA AGCGTAGGTT
3501 GGCCAGGGGG TCTCCCCCCT CCTTGGCCAG CTCTTCAC?CT AGCCAGTTGT
3551 CTGCGCCTTC CTTGAAGGCG ACATGCACTA CCCACCATGT CTCtCCGGAC
3601 GCTQACCTCA TCGAGGCCAA CCTCCTGTGG CGGCAGGA6A TGGGCGGGAA
3651 CATCACGCGC GTGGAGTGGG AGAACAAGGT GGTAGTGCTG GACTCTtTCG
3701 ACGCGCTTCG AGCGGAGGAG GATGAGAGGG AAGTATCGGT TCCGGCGGAG
3751 ATGCTGCGGA AATCGAAGAA GTTCCGGGCA GCGATGGGGA TGTGGGCGCG
3801 CCCGGATTAC AACCCTGGAG TGTTAGAGTG GTGGAAGGAC CCGGACTACG
3851 TCCCTGGGGT GGTGCACGGG TGCCCGTTGC CACCTATCAA GGCCCCTCCA
3901 ATAGCAGCTC GACGGAGAAA GAGGAGGGTt GTCCTAACAG A(JTCCTCCGT
3951 GTGTTCTGCG TTAGCGGAGC TCGCTACTAA GACCTTGGGC AGCtCCGAAT
4001 CATCGGGCGT CGACAGCGGG AGGGCGAGGG CGCTTCCTGA CGAGGCCTCG
4051 GACGAGGGTG AC/^GGATC GGACGTTGAG TCGTAGTGCT GCATGCCCCC
4101 CCTTGAGGGG GAACCGGGGG AGGGGGATGT CAQTGACGGG TGTTGGTCTA
4151 GCGTGAGCGA GGAAGCTAGT GAGGATGTGG TCTGGTGCTC AATGTGGTAC
4201 ACATGGACAG GCGCGTTGAT CACGGCATQC GCTGGGGAGG AAAGCAAGCT
4251 GCCGATCAAC GCGTTGAGCA ACTCTTTGCT GCGGCACCAT AACATGGTTT
4301 ATGCCACAAC ATCTCGCAGC GCAGGCCTGC GGCAGAAGAA GGTCACCTTT
4351 GACAGACTGC AAGTCCTGGA CGACCAGTAC CGGGACGTGC TCAAGGAGAT
4401 GAAGGCGAAG GCGTGGACAG TTAAGGGTAA ACTCCTATCC GTAGAGGAAG
4451 CCTGGAAGCT GACGGGCCCA CATTCGGCCA AATCGAAGTT TGGGTATGGG
4501 GCAAAGGAGG TCCGGAAGCT ATCCAGGAAG GCCGTTAAGG ACATGGACTC
4551 CGTGTGGAAG GACTTGGTGG AAGAGAGTGT GACACGAATT GACACGACCA
4601 TCATGGGAAA AAATGAGGTT TTCTGTGTCC AACCAGAGAA AC3GAGGGGGT
4651 AAGCCAGCCC GCGTTATCGT ATTGCCAGAT CTGGGAGTCC GTGTATGGGA
4701 GAAGATGGGC CTCTATGATG TGGTCTGGAC CCTTCGTCAG GTCGTGATGG
4751 GCTGGTGATA GGGATTCGAG TACTGTGGTG GGCAGCGAGT GGAGTTCCTG
4801 GTGAATACCT GGAAATCAAA GAAAAACCCC ATGGGCTTTT CATATGAGAC
4851 TCGCTGTTTG GACTCAACGG TCAGCGAGAA CGACATCCGT GTTGAGGAGT
4901 GAATTTACCA ATGTTGTGAC TTGGCCCGCG AAGCCAGACA GGCCTATAAAA
4951 TCGCTCACAG AGCGGCTTTA TATCGGGGGT CCTCTGACTA ATTGAAAAGG
5001 GCAGAAGTGC GGTTATGGGC GGTGCCGCGC GAGCGGCGTG CTGACGAGTA
5051 GCTGCGGTAA CACCCTCACA TGTTACTTGA AGGCCTCTGC ACCCTGTCGA
5101 GCTGCGAAGC TCCAGGACTG CACGATGCTC GTGAACGCCG CCGGCCTTGT
5151 CGTTATCTGT GAAAGCGCGG GAACCCAAGA GGACGCGGCG AGCCTACGAG
5201 TCTTCACGGA GGCTATGACT AGGTACTCTG CCCCCCCCGG GGACCCGCCC
5251 CAACCAGAAT ACGACTTGGA GCTGATAACA TCATGTTCCT CCAATGTGTC
5301 GGTCGCCCAC GATGCATCAG GCAAAAGGGT GTACTACCTC ACCCGTGATC
5351 CCACCACCCC CCTCGCACGG GCTGCGTGGG AAACAGCTAG ACACACTCCA
5401 GTTAACTCCT GGCTAGGCAA CATTATCATG TATGCGCCCA CTTTGTGGGC
5451 AAGGATGATT CTGATGACTC ACTTCTTCTC CATCCTTCTA GCACAGGAGC
5501 AACTTGAAAA AGCCCTGGAC TGCCAGATCT ACGGGGCCTG TTACTCCATT
5551 GAGCCACTTG ACCTACCTCA GATCATTGAA CGACTCCATG GCCTTAGCGC
5601 ATTTTCACTC CATAGTTACT CTCCAGGTGA GATCAATAGG GTGGCTTCAT
5651 GCCTCAGGAA ACTTGGGGTA CCACCCTTGC GAGTCTGGAG ACATCGGGCC
5701 AGGAGCGTCC GCGCTAGGCT ACTGTCCCAG GGG6GGAGGG CCGCCACTTG
5751 TGGCAAGTAC CTCTTCAACT GGGCAGTGAA GACCAAACTC AAACTCACTC
5801 CAATCCCGGC TGCGTCCCAG CTGGACTTGT CCGGCTGGTT CGTTGCTGGT
5851 TACAGCGGGG GAGACATATA TCACAGCCTG TCTCGTGCCC GACCCCGCTG
5901 GTTCATGCTG TGCCTACTCC TACTTTCTGT AGGGGTAGGC ATCTACCTGC
5951 TCCCCAACCG ATAAA



FIG. 2D 



WO 03/031588



PCT/US02/32512 



7/92 

1 GCCACCATGG CCCCCATCAC CGCCTACAGC GAGCAGACCC GCGGCCTGCT 

51 GGGGTGCATC ATCACCAGCG TGAGCGGCCG GGAGAAGAAC CAGGTGGAGG 

101 GCGAG6TGCA GGTGGTGAGC ACGGCCAGCC AGAGCTTCCT GGCGAGCTGG 

151 GTGAACGGCG TGTGCTGGAC CGTGTACCAC GGCGGCGGGA GGAAGXGGGT

201 GGCCGGCCCC AAGGGCGGCA TCAC'CCAGAT GTACACCAAG GTGGACCAGG 

251 AGCTGGTGGG CTGGGAGGGC GCCCGCGGCG CCCGCAGCGT GACGGGGTGC 

301 ACCTGCGGCA GCAGCGAGCT GTACCTGGTG ACCCGCGACG CCGACGTGAT 

351 GGGGGTGCGG CGCCGCGGGG ACAGCCGCGG CAGGGTGCTG agcggccgcc 

401 GGGTGAGGTA CGTGAAGGGG AGGAGGGGCG GCCGGGTGGT GTGCGGGAGC 

451 (3GCGACGCGG TGCSGGATCTT GGGGGGCGGC GTGTGGAGCG GGGGGGTGGG 

501 CAAGGCGGtG GAGTTGGTGC GCGTGGAGAG GATGGAGACC AGCATGGGCA 

551 GCGCGGTGTT CACCGACAAC AGGAGGGGCG CGCSGCGTGGG CCKOkGQTTC 

601 CAGGTGGGGG ACGTGCACGC GCGGAGGGGC AGCGGGAAGA GGAGCAAGGT 

651 GCGCGGGGGG TACGCCGCCG AGGGCTACAA GGTGGTGGTG C*rGAAGGCGA 

701 GGGTGGGCGC GAGGCTGGGG TTGGGCGCCT ACATGAGCAA GGGGGAGGGG 

751 ATCGACCGCA ACATCCgCAG CGGCGTGCGC AGCATGACCA CCGGGGGCGG 

801 GGTGACCTAC AGCACCTAGG GCAAGTTGGT GGGGGAGGGG GGCTGGAGGG 

851 GGGGGGGGTA CGAGATCATC ATGTGGGAGG AGTGGGACAG CACGGAGAGG

901 ACGAGCATCC TGGGCATGGG CAGGGfGCTG GACGAGGGGG AGAGGGGCGG 

951 GGGCGGGGTG GTGGTCCTGG GGAGGGCCAC GGCGGGGGGG AGGGTGAGGG

1001 TGGCGGAGGG CAAGATCGAG GAGGTGGGGC TGAGGAAGAG CGGGGAGATG

1051 CGGTTGTAGG GGAAGGGGAT GGGGATCGAG GGCATGGGGG GGGGGGGGCA 

1101 GGTGATGTTG TGGGAGAGGA AGAAGAAGTG CGAGGAGGTG GGGGGGAAGC

1151 tgaggggggt ggggatgaag ggggtggggt agtagggggg ggtggaggtg 

1201 agggtgatgg cgagcatggg cgaggtggtg gtggtggcga gggagggggt 

1251 gatgaggggg tagagcgggg agttggagag ggtgatggag tggaagacct

1301 gcgtgaggga gagggtggag ttgaggctgg aggggaggtt gacgatggag 

1351 aggaggaggg tgccccagga cgccgtgagc cgcagcgagc ggcggggggg 

1401 gaccggccgc ggccgccgcg gcatctacgg cttcgtgacc gccggcgagg 

1451 ggcccagcgg catgttggag agcagcgtgg tgtgggagtg gtacgagggc 

1501 gggtgcgcgt ggtacgaggt gacccccggg gagaccagcg tgcgcci^g 

1551 ggcctacctg aacagccgcg gggtgcgcgt gtgccaggac caggtggagt 

1601 tgtg<5gagag cgtcttcacg ggggtgacgg agatcgaggc Cgacttgctg 

1651 AGCCAGAGCA AGCAGGCdiSG CGACAAGTTG GGGTAGCTGG TGGGGTAGCSA

1701 GGCCACCGTG TGCGCCCGCG CCCAGGCCCC CCCCCCCAGG TGGGACCAGA
1751 TGTGGAAGTG CCTGATCCGC CTGAAGCCCA CCCTGCACGG CCCCACCCCC
1801 CTGCTGTACC GCCTGGGCGC CGTGCAGAAC GAGGTGACCC TGACCCACCC
1851 CATCACCAAG TACATCATGG CCTGCATGAG CGCCGACCTG GAGGTGGTGA
1901 CCAGCACCTG GGTGCTGGTG GGCGGCGTGC TGGCCGCCCT GGCCGCCTAC
1951 TGCCTGACCA CCGGCAGCGT GGTGATCGTG GGCCGCATCA TCCTGAGCGG
2001 CCGCCCCGCC ATCGTGCCCG ACCGCGAGTT CCTGTACCAG GAGTTCGACG
2051 AGATGGAGGA GTGCGCGAGC CACCTGCCCT ACATCGAGCA GGGCATGCAG
2101 CTGGCCGAGC AGTTCAAGCA GAAGGCCCTG GGCCTGCTGC AGACCGCCAC
2151 CAAGCAGGCC GAGGCCGCCG CCCCCGTGGT GGAGAGCAAG TGGCGCGCCC
2201 TGGAGACCTT CTGGGCCAAG CACATGTGGA ACTTCATCAG CGGCATCCAG
2251 TACCTGGCCG GCCTGAGCAC CCTGCCCGGC AACCCCGCCA TCGCCAGCCT
2301 GATGGCCTTC ACCGCCAGCA TCACCAGCCC CCTGACCACC CAGAGCACCC
2351 TGCTGTTCAA CATCCTGGGC GGCTGGGTGG CCGCCCAGCT GGCCCCCCCC
2401 AGCGCCGCCA GCGCCTTCGT GGGCGCCGGC ATCGCCGGCG CCGCCGTGGG
2451 CAGCATCGGC CTGGGCAAGG TGCTGGTGGA CATCCTGGCC GGCTACGGCG
2501 CCGGCGTGGC CGGCGCCCTG GTGGCCTTCA AGGTGATGAG CGGCGAGATG
2551 CCCAGCACCG AGGACCTGGT GAACCTGCTG CCCGCCATCC TGAGCCCCGG
2601 CGCCCTGGTG GTGGGCGTGG TGTGCGCCGC CATCCTGGGC CGCCACGTGG
2651 GCCCCGGCGA GGGCGCCGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGCC
2701 AGCCGCGGCA ACCACGTGAG CCCCACCCAC TACGTGCCCG AGAGCGACGC
2751 CGCCGCCCGC GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTGC
2801 TGAAGCGCCT GCACCAGTGG ATCAACGAGG ACTGCAGCAC CCCCTGCAGC
2851 GGCAGCTGGC TGCGCGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA
2901 CTTCAAGACC TGGCTGCAGA GCAAGCTGCT GCCCCAGCTG CCCGGCGTGC
2951 CCTTCTTCAG CTGCCAGCGC GGCTACAAGG GCGTGTGGCG CGGCGACGGC
3001 ATCATGCAGA CCACCTGCCC CTGCGGCGCC CAGATCACCG GCCACGTGAA
3051 GAACGGCAGC ATGCGCATCG TGGGCCCCAA GACCTGCAGC AACACCTGGC
3101 ACGGCACCTT CCCCATCAAC GCCTACACCA CCGGCCCCTG CACCCCCAGC
3151 CCCGCCCCCA ACTACAGCCG CGCCCTGTGG CGCGTGGCCG CCGAGGAGTA
3201 CGTGGAGGTG ACCCGCGTGG GCGACTTCCA CTACGTGACC GGCATGACCA
3251 CCGACAACGT GAAGTGCCCC TGCCAGGTGC CCGCCCCCGA GTTCTTCACC
3301 GAGGTGGACG GCGTGCGCCT GCACCGCTAC GCCCCCGCCT GCCGCCCCCT
3351 GCTGCGCGAG GAGGTGACCT TCCAGGTGGG CCTGAACCAG TACCTGGTGG

GCCCTGCGAG CCCGAGCCCG
ACCCCAGCCA CATCAGCGCC 
AGCCCCCCCA GCCTGGGCAG 
CCTGAAGGGC ACCTGCACGA 
TCGAGGCCAA CCTGCTGTGG 
GTGGAGAGCG AGAACAAGGT 
CGCCGAGGAG GACGAGGGCG 
AGAGCAAGAA GTTCCCCGCG 
AACCCCGCCC TGCTGGAGAG 
GGTGCACGGC TGCCCCGTGC 
CCCGCGGCAA GCGCACCGTG 
GTGGCCGAGC TGGCCACCAA 
GGACAGGGGC ACCGCCACCG 
ACAAGGGCAG CGACGTGGAG 
GAGCCCGGCG ACGCCGAGCT 
GGAGGCCAGC GAGGACGTGG 
GCGCCGTGAT CACCCCCTGC 
GCCCTGAGCA ACAGCCTGCT 
CAGCCGCAGC GCCGGCCTGC 
AGGTGCTGGA CGACCACTAC 
GCCAGCACCG TGAAGGCCAA 
GACCCCCCCC CACAGCGCCA 
TGCGCAACCT GAGCAGCAAG 
GACCTGCTGG AGGACACCGT 
GAACGAGGTG TTCTGCGTGC 
GCCTGATCGT GTTCCCCGAC 
CTGTACGACG TGGTGAGCAC 
CGGGTTCCAG TACAGCCCCG 
GGAAGAGCAA GAAGAACCCC 
GACAGCACCG TGACCGAGAA 
GTGCTGCGAC CTGGCCCCCG 
AGCGCCTGTA CATCGGCGGC 
GGCTACCGCC GCTGCCGCGC 
CACCCTGACC TGCTACCTGA 

FIG. 3C 



ACGTGGCCGT GCTGAGCAGC 
GAGACCGCCA AGCGCCGCCT 
GAGCAGCGCC AGCCAGCTGA 
CCCACCACGT GAGCGCCGAG 
CGCCAGGAGA TGGGCGQCAA
GGTGGTGGTG GACAGCTTCG 
AGGTGAGGGT GCCCGCCGAG 
GCCATGGCGA TGTGGGGCGG 
GTGGAAGGAC CCCGACTACG 
CCCCCATCAA GGCCCGCCCC 
GTGCTGACCG AGAGCAGCGT 
GAGCTTCGGC AGCAGCGAGA
CCCTGCGCGA CCAGiGCCAGC 
AGCTACAGCA GCATGGCCCG 
GAGCGACGGG AGCTGGAGCA 
TGTGGTGCAG GATGAGCTAC 
GCGGCCGAGG AGAGCAAGCT 
GCGCCAGCAC AACATGGTGT 
GCCAGAAGAA GGTGACCTTC 
CGCGACGTGC TiSAAGGAGAT 
GCTGCtGAGC GTGGAGGAGG 
AGAGCAAGTt ceGC^ACGGd 
GCCGTGAACC ACATCCACAG 
GACCCCCATC GACACCACCA 
AGCCCGAGAA GGGCGGCC(3C 
CTGGGCGTGC GCGTGTGCGA 
CCTGCCCCAG GTGGTGATGG 
GCCAGCGCGT GGAGTTCCTG 
ATGGGCTTCA GCTACGACAC 
CGACATCCGC GTGGAGGAGA 
AGGCCCGCCA GGCCATCAAG 
CCCCTGACCA ACAGCAAGGG 
CAGCGGCGTG CTGACCACCA 
AGGCCAGCGC CGCCTGCCGC

5101 GCCGCCAAGC TGCAGGACTG CACCATGCTG GTGAACGCCG CCGGCCTGGT

5151 GGTGATCTGC GAGAGCGCCG GCACCCAGGA GGACGCCGCC AGCCTGCGCG 

5201 TGTTCACCGA GGCCATGACC CGCTACAGCG CCCCCCCCGG CGACCCCCCC 

5251 CAGCCCGAGT ACGACCTGGA GCTGATCACC AGCTGCAGCA GCAACGTGAG 

5301 CGTGGCCCAC GACGCCAGCG GCAAGCGCGT GTACTACCTG ACCCGCGACC 

5351 CCACCACCCC CCTGGCCCGC GCCGCCTGGG AGACCGCCCG CCACACCCCC 

5401 GTGAACAGCT GGCTGGGCAA CATCATCATG TACGCCCCCA CCCTGTGGGC 

5451 CCGCATGATC CTGATGACCC ACTTCTTCAG CATCCTGCTG GCCCAGGAGC 

5501 AGCTGGAGAA GGCCCTGGAC TGCCAGATCT ACGGCGCCTG CTACAGCATC 

5551 GAGCCCCTGG ACCTGCCCCA GATCATCGAG CGCCTGCACG GCCTGAGCGC

5601 CTTCAGCCTG CACAGCTACA GCCCCGGCGA GATCAACCGC GTGGCCAGCT 

5651 GCCTGCGCAA GCTGGGCGTG CCCCCCCTGC GCGTGTGGCG CCACCGCGCC 

5701 CGCAGCGTGC GCGCCCGCCT GCTGAGCCAG GGCGGCCGCG CCGCCACCTG

5751 CGGCAAGTAC CTGTTCAACT GGGCCGTGAA GACCAAGCTG AAGCTGACCC 

5801 CCATCCCCGC CGCCAGCCAG CTGGACCTGA GCGGCTGGTT CGTGGCCGGC 

5851 TACAGCGGCG GCGACATCTA CCACAGCCTG AGCCGCGCCC GCCCCCGCTG 

5901 GTTCATGCTG TGCCTGCTGC TGCTGAGCGT GGGCGTGGGC ATCTACCTGC 

5951 TGCCCAACCG CTAAA

1 catcatcaat
61 ttgtgacgtg 
121 gatgttgcaa 
181 gtgtgcgccg 
241 taaatttggg 
301 agtgaaatct 
361 gactttgacc 
421 cgggtcaaag 
481 catatcataa 
541 gattattgac 
601 tggagttccg 
661 cccgcccatt 
721 attgacgtca 
781 atcatatgcc 
841 atgcccagta 
901 tcgctattac 
961 actcacgggg 
1021 aaaatcaacg 
1081 gtaggcgtgt 
1141 cctggagacg 
1201 tccgcggccg 
1261 accatggcgc 
1321 actagcctta 
1381 gcaacacaat
1441 gctggctcaa 
1501 gaccaggacc 
1561 tgtggcagct 
1621 cggggcgaca 
1681 tcgggtggtc 
1741 tgcacccggg 
1801 atgcggtctc 
1861 gtggcccacc 
1921 gcagcccaag 
1981 ggggcgtata 
2041 attaccacag 
2101 tgctctgggg 
2161 acaatcttgg 
2221 gtgctcgcca 
2281 gtggccctgt 
2341 atcagggggg 
2401 gcaaagctgt 
2461 gtcataccaa 
2521 acgggcgact 
2581 agcttggatc 
2641 tcgcagcggc 
2701 ggagaacggc 
2761 tgtgcttggt 
2821 acaccagggt 
2881 ctcacccaca 
2941 tacctggtag 
3001 gatcaaatgt 
3061 ctgtacaggc 
3121 atcatggcat 
3181 ggagtccttg 
3241 aggattatct
gtgtacacag

cgtaaccgag 

gaataatttt
cgttacataa

gacgtcaata 

atgggtggag 

aagtacgccc 

catgacctta 

catggtgatg 

atttccaagt 

ggactttcca 

acggtgggag 

ccatccacgc 

ggaacggtgc 

ccatcacggc 

caggccggga 

ccttcctggc 

agaccttagc 

tcgtcggctg 

cagaccttta 

gtagggggag 

cactgctctg 

gggttgcgaa 

cggtcttcac 

tacacgctcc 

ggtacaaggt 

tgtctaaggc 

gcgcccccgt 

gcgcttatga 

gcatcggcac 

ccgctacgcc 

ctaatactgg 

gaaggcatct 

caggcctcgg 

ctatcggaga 

ttgactcagt 

ccaccttcac 

ggggtaggac 

cctcgggcat 

acgagctcac 

tgcccgtttg 

tagatgcaca 

cataccaagc 

ggaagtgtct 

tgggagccgt 

gcatgtcggc 

cagctctggc 

tgtccgggag 



attttggatt 
tgggaacggg 
acacatgtaa 
gaagtgacaa 
taagatttgg 
gtgttactca 
agactcgccc 
attattatag 
tatattggct 
tagtaatcaa 
cttacggtaa 
atgacgtatg 
tatttacggt 
cctattgacg 
tgggactttc 
cggttttggc 
ctccacccca 
aaatgtcgta 
gtctatataa 
tgttttgacc 
attggaacgc 
ctactcccaa 
caagaaccag 
gacctgcgtc 
cggcccaaag 
gcaggcgccc 
cttggtcacg 
cctgctctcc 
cccttcgggg 
ggcggtggac 
ggacaactca 
cactggcagc 
gctcgtcctc 
acacggtatt 
cacatactct 
catcataata 
agtcctggac 
tccgggatcg 
agagatcccc 
cattttctgt 
aatcaacgct 
cgtcgttgtc 
gatcgactgt 
cattgagacg 
tggcaggggt 
gttcgattcc 
ccccgccgag 
ccaggaccac 
cttcttgtcc 
cacggtgtgc 
catacggctg 
ccaaaatgag 
tgacctggag 
cgcgtattgc 
gccggctatt
ctggagttct
gtcaccctca

gtcgtcacta 

ctgacaacag
tgataatgag

tagtagtgtg 

tggcaaaagt 

gttttaggcg
cggctggagc

cacacccaaa 

aagccatccc 

agaagtgcga 

accgggggct 

acgctctgat 

tcacccagac 

ctcaagacgc 

tctacaggtt 

gtgagtgcta 

ggttgcgggc 

gggagagtgt 

aggcaggaga 

aggccccacc 

tgcacgggcc 

cccaccccat 

gcacctgggt 

gcagtgtggt 

gggagtttct 



ggggtggagt 

gcggaagtgt

3361 gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag
3421 gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac
3481 atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac
3541 cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa
3601 agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc
3661 gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 
3721 gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 
3781 gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 
3841 gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 
3901 cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 
3961 cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 
4021 actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt
4081 aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 
4141 tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 
4201 ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 
4261 atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 
4321 aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 
4381 tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 
4441 gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 
4501 atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 
4561 gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag
4621 gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 
4681 gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 
4741 acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 
4801 cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct
4861 gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 
4921 gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 
4981 gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 
5041 atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 
5101 gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 
5161 ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta
5221 gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg
5281 gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg
5341 tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 
5401 tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 
5461 tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 
5521 ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 
5581 ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 
5641 gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 
5701 gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 
5761 aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 
5821 ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 
5881 tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 
5941 ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 
6001 gtgatgggct cctcatacgg actccagtac tctcctgggc agcgagtcga gttcctggtg 
6061 aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 
6121 tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 
6181 gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 
6241 ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 
6301 acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 
6361 gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 
6421 agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 
6481 tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 
6541 tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 
6601 cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt
6661 aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg
6721 atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 
6781 cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 
6841 ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 
6901 gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 
6961 agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 
7021 ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 
7081 gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 
7141 cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 
7201 tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 
7261 ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 
7321 ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 
7381 ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc 
7441 ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 
7501 ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 
7561 gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 
7621 atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 
7681 gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 
7741 actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 
7801 tttgctttcc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 
7861 aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 
7921 cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 
7981 gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 
8041 tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 
8101 ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 
8161 atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 
8221 gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 
8281 ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 
8341 agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 
8401 gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 
8461 tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 
8521 gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 
8581 ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 
8641 aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 
8701 ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 
8761 ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg 
8821 gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 
8881 gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 
8941 ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 
9001 ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 
9061 aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 
9121 agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct 
9181 cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 
9241 gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 
9301 tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg 
9361 tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt
9421 catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 
9481 cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 
9541 ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 
9601 tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 
9661 tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 
9721 cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 
9781 actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 
9841 aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 
9901 cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 
9961 ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat
10021 cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 
10081 ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 
10141 tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 
10201 gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg
10261 tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 
10321 gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 
10381 gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 
10441 gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta
10501 gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt
10561 cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 
10621 gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 
10681 cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag
10741 ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 
10801 cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 
10861 tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 
10921 ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 
10981 gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 
11041 acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 
11101 tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 
11161 actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 
11221 cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt
11281 actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 
11341 gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 
11401 cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 
11461 ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 
11521 gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 
11581 gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 
11641 aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaa^g 
11701 tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 
11761 cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 
11821 gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 
11881 ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 
11941 agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 
12001 gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 
12061 agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca 
12121 caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 
12181 cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 
12241 ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa 
12301 catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 
12361 gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 
12421 tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaergccg catccccgcg 
12481 gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 
12541 ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag 
12601 agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt 
12661 gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 
12721 gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 
12781 gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 
12841 ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 
12901 ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 
12961 gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg 
13021 cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag 
13081 gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa 
13141 cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 
13201 ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg 
16561 gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac
16621 cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 
16681 aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 
16741 cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 
16801 ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 
16861 ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 
16921 gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 
16981 ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 
17041 cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 
17101 caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 
17161 accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 
17221 agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 
17281 gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg
17341 tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 
17401 gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 
17461 ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 
17521 gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 
17581 gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc 
17641 gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 
17701 agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 
17761 ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 
17821 ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 
17881 ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac
17941 gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca
18001 caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 
18061 tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 
18121 ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 
18181 ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 
18241 gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 
18301 cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 
18361 ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 
18421 ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 
18481 aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 
18541 ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 
18601 gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 
18661 gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 
18721 cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 
18781 tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 
18841 aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 
18901 gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 
18961 cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 
19021 gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 
19081 agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 
19141 tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 
19201 cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 
19261 cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 
19321 ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 
19381 atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 
19441 gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 
19501 ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 
19561 ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 
19621 gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc
19681 tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 
19741 gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 
19801 ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 
19861 cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa
19921 gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca
19981 caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga
20041 ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac
20101 cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg
20161 tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg
20221 cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat
20281 tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag
20341 tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg ttagcggcct
20401 gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga
20461 ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa
20521 aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga
20581 agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga
20641 tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt
20701 acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac
20761 gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga
20821 ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa
20881 ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac
20941 actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taiaagcgcga
21001 gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga
21061 tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat
21121 caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag
21181 tagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc
21241 ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga
21301 ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag
21361 gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc
21421 tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg
21481 aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc
21541 cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca
21601 ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg
21661 cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg
21721 ccacggcctg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg
21781 tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc
21841 cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta
21901 catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt
21961 ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt
22021 catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg
22081 ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc
22141 ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca
22201 aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc
22261 agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc
22321 ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga
22381 agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg
22441 cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc
22501 cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc
22561 gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc
22621 gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg
22681 tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg
22741 tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt
22801 ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg
22861 acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact
22921 tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag
22981 accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact
23041 cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca
23101 cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca
23161 ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg
23221 aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 
23281 aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 
23341 gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 
23401 ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 
23461 aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 
23521 tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta
23581 tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 
23641 caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 
23701 taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 
23761 ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 
23821 acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc
23881 aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 
23941 agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 
24001 ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 
24061 caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 
24121 caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 
24181 aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc
24241 tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 
24301 accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 
24361 tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 
24421 acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 
24481 tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 
24541 ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 
24601 atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 
24661 ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 
24721 ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 
24781 ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc
24841 catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 
24901 ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 
24961 ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 
25021 ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgaigtbtgag attaagcgct
25081 cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 
25141 tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca
25201 aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 
25261 atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 
25321 tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct
25381 acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 
25441 gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 
25501 tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 
25561 atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 
25621 tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 
25681 gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 
25741 tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac
25801 ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 
25861 cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 
25921 aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 
25981 ccagtttgag tacgagtcac tcctgcgceg tagcgccatt gcctcttccc ccgaccgctg 
26041 tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcet 
26101 attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 
26161 ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 
26221 gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgccct^ 
26281 cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 
26341 gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 
26401 tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 
gctatgcgcc
aggcacaacc 
caacgcgttt 
cgcgcgcgag 
cacgctggcc 
cagggcgaac 
tgagttgcac 
atacagcgcc 
agagaagaac 
cacgcagcac 
cacgatcttg 
atccatttca 
gccttcgatc 
gtaggttacc 
aaaggtcttg 
cttgcatacg 
atcgttatcc 
cgcagacacg 
ggactcttcc 
ccgccgcacc 
acccaccatt 
ggatggcggg 
caaatccgcc 
tgacgagtct 
Scggggaggc 
cgccgcaccg 
ttccttctcc 
cgcccccttt 
ccccgtcgag 
tgtaagcgaa 
cgacgcagag 
agatgtggga 
cgcgttgcaa 
acgccacctg 
caacccgcgc 
catctttttc 
caagcagctg 
gccaaaaatc 
agaaaacagc 
gcgcctagcc 
cctacccccc 
cctggagagg 
tgagcagctg 
gctaatgatg 
tgacccggag 
cgtgcgccag 
aattttgcac 
gcgccgcgac 
catgggcgtg 
aaagcaaaac 
ggcggacatt 
caccagtcaa 
gcccgccacc 
tccgccgctt 
cgacatcatg 



actggcaggg 
tggcagcagt
atcttccccg
agcatgttgc 
gaagacgtga



acacgttgcg 
gctcggtgaa 
gcgccgatat 
tgtcggagat
atactggtgt
29761 gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa
29821 gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc
29881 cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 
29941 aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 
30001 agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca
30061 aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 
30121 gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 
30181 cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag
30241 gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa
30301 gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg
30361 tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt
30421 gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga 
30481 tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag
30541 caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc
30601 ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc
30661 gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 
30721 ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 
30781 tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct
30841 ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta
30901 tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 
30961 gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 
31021 ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 
31081 cgcGCtttct caaaltttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 
31141 agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 
31201 ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 
31261 catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 
31321 aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 
31381 ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 
31441 agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 
31501 tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 
31561 tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 
31621 gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 
31681 ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt
31741 gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 
31801 tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 
31861 ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 
31921 cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 
31981 cggcgtccgg ctcaccaccc aggtagagct tacacgcagc ctgattcggg aigtttaccaa 
32041 gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 
32101 tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 
32161 attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 
32221 ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 
32281 aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 
32341 cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgaicacg 

32401 gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc
32461 caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 
32521 ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 
32581 tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 
32641 acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 
32701 gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 
32761 aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 
32821 acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 

32881 cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat
32941 ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta
33001 acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 
ggcgcaatag

gtggatagcg 
tcctcaaacg

ggagctatgg 

atgggcagca 

tgcagaattg 

caaattttgg 

actctaagca 

tcatcactgg 

tacacttatg 
ttcgcagggc

cgccaggaac 

taaccagcgt 

tgctcaaaaa 

gcagataaag 

acatgtctgc 
ggtttgatac

ccggtcctaa 
taaacaatga
atctggcaac

ccaaaaacta 

aacaattaac 

aaaaacaaaa 
cagacttact
taacaaaaat
tgattatgac
tggcctctat

aaaaggcctt 

tgaaacagac 

taataccaat 

agccataaca 

atccccaaat 

atgtggcagt 
tatgtcaaat
tgttatgttt

gtagtatagc 
cccggctggc
gttgctcaac

gcatcaggat 
gcataaggcg

agtaactgca 

caaagctcat 

ttaagtggcg 

aattcaccac 

tcctaaacca 

aacaatgaca 

caatgttggc 

gcgtcagaac 

agggaagacc 

gcggatgatc 

tactgtacgg 

gaacgccgga 

ctgcgtctcc 

ctcaaagcat 

gccctgataa 

tgcgagtcac 

caaaagatta 

ggcgtggtca 

ggcttccaaa 

gtgaatctcc 

ccaccttctc 

ctgctccaga 

ggttcctcac 

cgtaggtccc 

gccacttccc 

ggagctatgc 

tgcaaggtgc 

tcatgctcat 

tttctctcaa 

tttaaacatt 

ctacggccat 
gccggcgtga
ggtcatgtcc 
cagtgctaaa 
cattacagcc 
tgaaaaaccc 
ttccacagcg 
acaccactcg 
gcgagtatat 
aaaccgcacg 
cgtcacttcc 
cacatacaag 
cacgtcacaa 
attgatgatg 



ccgtaaaaaa 
ggagtcataa 
aagcgaccga 
cccataggag 
tcctgcctag 
gcagccataa 
acacggcacc 
ataggactaa 
cgaacctacg 
gttttcccac 
ttactccgcc 
actccacccc 
tgtaagactc
aatagcccgg 
ctcattatca
gtcacagtgt
tattggcttc
tcaggttgat
acccgcaggc 
gagaaaaaca 
gtccacaaaa
aagaaaacta
acagctcctc

tcacatcggt 

gtagagacaa

cataaacacc 

catacagcgc

tattaaaaaa 

caagtgcaga 

aacacccaga 

ttcctcaaat 

caattcccaa 

cgccccgcgc 

aaggtatatt 
10 30 50

ATGGCGCCCATCACGGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACT 
^ + + + + + ^+ 

MetAlaProIleThrAlaTyrSerGlnGlnThrArgGlyLeuLeuGlyCysIleIleThr

10 20 

70 90 110 

AGCCTTACAGGCCGGGACAAGAACCAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCA 
. ^ ^ ^ ^ ^ 

SerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGlnValValSerThrAla 

30 40 

130 150 170 

ACACTATCCTTCCTGGCGACCTGCGTO^GGCGTGTGTTGGACC^^ 

""^ •- + + + + + 

ThrGlnSerPheLeuAlaThrCysValAsnGlyValCysTrpThrValTyrHisGlyAla

50 60 

190 210 230

GGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGATGTACACTAATGTGGAC 
^ ^ ^ ^ ^ ^ 

GlySerLysThrLeuAlaGlyProLysGlyProIleThrGlnMetTyrThrAsnValAsp 

70 80 

250 270 290 

CAGGACCTCGTCGKSCTGGCAGGCCSCCCCCCGGGGCGCGTTCCTTGACACCATGCAC^ 

+ ^ ^ ^ ^ ^ 

GlnAspLeuValGlyTrpGlnAlaProProGlyAlaArgSerLeuThrProCysThrCys

90 100 

310 330 350 

GGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGG 
+ + + + + + 

GlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIleProValArgArgArg

110 120 

370 390 410 

GGCGACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCG 

+ + + + + 

GlyAspSerArgGlySerLeuLeuSerProArgProValSerTyrLeuLysGlySerSer 

130 140 
430 450 470

GGTGGix:CACTGCl^l«:CCTTCG , 

: + : — ^ + + ^+ . 

GlyGlyProLeuLeuCysProSerGlyHisAlaValGlyIlePheArgAlaAlaValC^

150 160 

490 510 530

ACCCGGC^GGTTGCGAAGGCGGTGGAGTTtGTGCC^ 

— + — + +^ — . — + ^---+ 

ThrArgGlyValAlaLysAlaValAspPheValProValGluSerMetGluThi^

170 180 

550 570 590 

CGGTCTCCGGTCTTCACGGACAAC 

ArgSerProValPheThrAspAsnSerSerProProAlaValProGln

190 200 

610 630 650 

GCCGAGCTACACGCTCCCACTC . 

AlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysValProAlaAlaT^ 

210 220 

670 690 710 

GCCCAAGGGTACAAG6TGCTCGTCCTCAATCC 

AlaGlnGlyTyrLysValLeuValLeuAsnProSerValAla^

230 240 

730 750 770 

GCGTATATGTCTAAGGCACACGGtATTCACCCCAAGATCA 

AlaTyrMetSerLysAlaHisGlyIleAspProAsnIleArgThrGlyValArgThrIle

250 260 

790 810 830 

ACCACAGGCiGCCCCCGtCAGAT^ 

+ + ^ + + — ^ — + i — + 

ThrThrGlyAlaProValThrTyrSerThrTyrGlyLy

270 280 
850 870 890

TCTGGGGGCGCTTATGACATCyiTAATATGTGATGAGTGCCATTCAACTGACTCGACTAC^ 
+ + ^ ^ ^ 

SerGlyGlyAlaTyrAspIleIleIleCysAspGluCysHisSerThrAspSerThrThr

290 300 

910 930 950 

ATCTTGGGCATCGGCACAGTCCTGGACCAAGCGGAGACGGCTGGAGCGCGGCTTGTCGTG 

+ + r + + 

IleLeuGlyIleGlyThrValLeuAspGlnAlaGluThrAlaGlyAlaArgLeuValVal

310 320 

970 990 1010

CTCGCCACCGCTACG§qTCCGGGATCG»TCACCGTGCCACACCCAAACATC 

+ + + + H + 

LeuAlaThrAlaThrProProGlySerValThrValProHisProAsnIleGluGluVal

330 340 

1030 1050 1070 

GCCCTGTCTAATACTGGAGAGATCCCCTTCTATGGCAAAGCCATCCCCATTGAAGCCATC 

— + + + + + + 

AlaLeuSerAsnThrGlyGluIleProPheTyrGlyLysAlaIleProIleGluAlaIle

350 360 

1090 1110 1130 

AGGGGGGGAAGGCATGTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAG^ 
+ + + ^ ^ ^ 

ArgGlyGlyArgHisLeuIlePheCysHisSerLysLysLysCysAspGluLeuAlaAla

370 380 

1150 1170 1190 

AAGCTGTCAGGCCTCGGAATCAACGCTGTGGCGTATTACCGGGGGGTCGATGTGTCCGTC 
- + + + + -+ + 

LysLeuSerGlyLeuGlyIleAsnAlaValAlaTyrTyrArgGlyLeuAspValSerVal

390 400 

1210 1230 1250 

ATACCAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACG
- + + ^ ^ ^ ^ 

IleProThrIleGlyAspValValValValAlaThrAspAlaLeuMetThrGlyTyrThr

410 420 
1270 1290 1310

GGCGACTTCGACTCAGTCATGG^ 

— ^ + + + + 

GlyAspPheAspSerValIleAspCysAsnThrCysValThrGlnThrValAspPheSer

430 440 

1330 1350 1370 

TTGGATCCCACCTTCACCATTGAGACGACGACCGTGCCTCAAGAGGGAGTC 

LeuAspProThrPheThrIleGluThrThrThr

450 460 

1390 1410 1430 

CAGCGGGGGGGTAGGACT^^ 

GlnArgArgGlyArgThrGlyArgGlyArgArgGlyIleTyrA^

470 480 

1450 1470 1490 

GAAGGGGCCtCGGGCATG^ITC 

GluArgProSerGlyMetPheAspSer

490 500 

1510 1530 1550 

GCifTGGTAGGAGCTGACGC^ 



510 520 

1570 1590 1610 

CeAGGGTTGCCGGTOTGGGAG^ 

ProGlyLeuProValCysGlnAspHisLeuGluPheTrpGluSerValPheThrGlyLe^

530 540 

1630 1650 1670 

ACGCAGATAGATGCACAClTCTO 

ThrHisIleAspAlaHisPheLeuSerGlnThrLysGlnAlaGlyAspAsnPheP^

550 560 
1690 1710 1730

CTGGTAGCATACCAAGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGAT 
. + ^ ^ . ^ ^ ^ 

LeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaProProProSerTrpAsp 

570 580 

1750 1770 1790 

CAAATGTGGAAGTGTCTCATACGGCTGAAACCTACGCTGCACGGGCCAACACCCTTGCTG 
* + + + + + — + 

GlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGlyProThrProLeuLeu

590 600 

1810 1830 1850 

TACAGGCTGGGAGCCGTCCAAAATGAGGTCACCCTCACCCACCCCATAACCAAATACATC 

+ ^ ^ ^ ^ _ ^ 

TyrArgLeuGlyAlaValGlnAsnGluValThrLeuThrHisProIleThrLysTyrIle

610 620 

1870 1890 1910 

ATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTGGGTGCTGGTGGGCGGA 
+ + ^ ^ 

MetAlaCysMetSerAlaAspLeuGluValValThrSerThrTrpValLeuValGlyGly 

630 640 

1930 1950 1970 

GTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACAGGCAGTGTGGTCATTGTGGGTAGG 

+ + + + + . 4. 

ValLeuAlaAlaLeuAlaAlaTyrCysLeuThrThrGlySerValValIleValGlyArg

650 660 

1990 2010 2030 

ATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGTTTCTCTACCAGGAGTTC 

+ 4..^.. + ^ ^ 

IleIleLeuSerGlyArgProAlaIleValProAspArgGluPheLeuTyrGlnGluPhe

670 680 

2050 2070 2090 

GATGAAATGGAAGAGTGCGCCTCGCACCTCCCTTACATCGAGCAGGGAATGCAGCTCGCC 
+ + + + + + 

AspGluMetGluGluCysAlaSerHisLeuProTyrIleGluGlnGlyMetGlnLeuAla

690 700 
2110 2130 2150

GA^CAAWCAAGGAGAAAGGGCTG^^ 

+ + -.^ + :+ : + 

GluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaThrLysGlnAlaGluAl^

710 720 

2170 2190 2210 
GCTGCTH^CCGTGGTGGAGTCCAAGTGGCGAGCCCTTGAGACATTCTGGGCGAA 
_ — + + + . — ^ 1+ ^ ^+ 

AlaAlaProValValGluSerLysTrpArgAlaLeuGluThrPheTrpAlaLysHisMet

730 740 

2230 2250 2270 

TGGAATTTCATCAGCGG 

TrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThrLeuProGlyAsnPro

750 760 

2290 2310 2330 

GCAAfAGCATGATlX5ATGGCATTCACAGCeTCTATCACC^ 

AlaIleAlaSerLeuMetAlaPheThrAlaSerIleThrSerProLeuThrThrGlnS

770 780 

2350 2370 2390 
ACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTGCCCAACTCGCGCGCCC

ThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeuAlaProProSerAla

790 800 

2410 2430 2450 

GCT^6GClTtCGTGGGCGCCGGCATCGCCGGTGCG<^ 

AlaSerAlaPheValGlyAlaGlyIleAlaGlyAlaAlaValGlySerIleGlyLeuGly

810 820 

2470 2490 2510 
AAGGIWTTCTGGACATTCTGGCGGGTTATGGAGCAGGAGTGGCCG^

LysValLeuValAspIleLeuAlaGlyTyrGlyAlaGlyValAlaGlyAlaLeuValAla

830 840 
2530 2550 2570

TTCAAGGTCATGAGCGGCGAGATGCCCTCCACCGAGGACCTGGTCAATCT^^ 

PheLysValMetSerGlyGluMetProSerThrGluAspLeuValAsnLeuLeuProAla

850 860 

2590 2610 2630 

ATCCTCTCTCCTGGCGCCCTGGTCGTCGGGGTCGTGTGTGCAGCAATACTGCGTCGACAC

IleLeuSerProGlyAlaLeuValValGlyValValCysAlaAlaIleLeuArgArgHis

870 880 

2650 2670 2690 

GTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATAGCGTTCGCCTCGCTO 

ValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIleAlaPheAlaSerArg

890 900 

2710 2730 2750 

GGTAATCATGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGCCGCAGCGCGTGTTACT 

GlyAsnHisValSerProThrHisTyrValProGluSerAspAlaAlaAlaArgValThr 

910 920 

2770 2790 2810 

CAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGGATTAAT

GlnIleLeuSerSerLeuThrIleThrGlnLeuLeuLysArgLeuHisGlnTrpIleAsn

930 940 

2830 2850 2870 

GAAGACTGCTCCACACCGTGTTCCGGCTCGTGGCTAAGGGATGTTTGGGACTGGATATGC 

GluAspCysSerThrProCysSerGlySerTrpLeuArgAspValTrpAspTrpIleCys 

950 960 

2890 2910 2930 
ACGGTGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGGAGCTACCGGGA 

ThrValLeuThrAspPheLysThrTrpLeuGlnSerLysLeuLeuProGlnLeuProGly 

970 980 
2950 2970 2990

GTCCGT^TlTTCTCGtGCCAACGCGGGTACi^ 

ValProPhePheSerCysGlnArgGlyTyrLysGlyValTrpArgGlyAspGlyIleMet

990 1000 

3010 3030 3050 
CAAACCACCTGCCCATGTGGAGCACAGATCACCGGACATGTCAAAAACGGTTCCATGAGG

GlnThrThrCysProCysGlyAlaGlnIleThrGlyHisValLysAsnGlySerMetArg

1010 1020 

3070 3090 3110 

ATCGTCGGGCGTAAGACC



1030 1040

3130 3150 3170 

ACCAGG(k3CCGCTGCAGAGCCTCTCCAGCGGGAAAGTAT^ 

ThrThrGlyProCysThrProSerProAlaProAsnTyrSerArgAlaLeuTrpA^

1050 1060 

3190 3210 3230 

GGCdCTGAGGAGtAGGTGGAGGTGAG 


AlaAlaGluGluTyrValGluValThrArgValGlyAspPheHisTyrValThrGlyMet 

1070 1080 

3250 3270 3290 

AGCAGTCAGAAGCSTAAAGtGCCCATGGC 


ThrThrAspAsnValLysCysProCysGlnValProAlaProGluPhePheThrGluVal 

1090 1100 

3310 3330 3350 

GACGGAGtGGGGTTGCACAGCSTACGCTCCGGGGTGCAG^ 

AspGlyValArgLeuHisArgTyrAlaProAlaCysArgProLeuLeuArgGluGluVal

1110 1120 
3370 3390 3410

ACATTCCAGGTCGGGCTCAACCAATACCTGGTIX^GTCACAGCTACCAI^ 


ThrPheGlnValGlyLeuAsnGlnTyrLeuValGlySerGlnLeuProCysGluProGlu 

1130 1140 

3430 3450 3470 

CCGGATGTAGCAGTGCTCACTTCCATGCTCACCGACCCCTCCCACATCACAGCAGAAACG

ProAspValAlaValLeuThrSerMetLeuThrAspProSerHisIleThrAlaGluThr

1150 1160 

3490 3510 3530 

GCTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGCTCTTCAGCTAGCCAG 

AlaLysArgArgLeuAlaArgGlySerProProSerLeuAlaSerSerSerAlaSerGln

1170 1180 

3550 3570 3590 

TTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGTCTCTCCGGACGCTGAC

LeuSerAlaProSerLeuLysAlaThrCysThrThrHisHisValSerProAspAlaAsp

1190 1200 

3610 3630 3650 

CTCATCGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGCGTGGAG 

LeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsnIleThrArgValGlu

1210 1220 

3670 3690 3710 

TCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGCTTCGAGCGGAGGAGGATGAG

SerGluAsnLysValValValLeuAspSerPheAspProLeuArgAlaGluGluAspGlu

1230 1240 

3730 3750 3770 

AGGGAAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATG 


ArgGluValSerValProAlaGluIleLeuArgLysSerLysLysPheProAlaAlaMet 

1250 1260 
3790 3810 3830

CCGAa*:TCGGCGCGCCG6GATT^^ 

ProIleTrpAlaArgProAspTyrAsnProProLeuLeuGluSerTrpL^

1270 1280 

3850 3870 3890 

TACGTGCCTCGGGTGGTCCACGGGTGCCCGT 

TyrValProProValValHisGlyCysProLeuProProIleLysAlaPr

1290 1300 

3910 3930 3950 

CCTCGACGGAGAAAGAGGACGGT^ 

ProProArgArgLysArgThrValValLeuThrGluSerSerValSerSe

1310 1320 

3970 3990 4010 

GAGGTCGGTACTAAGACGTTC^ 

GluLeuAlaThrLysThrPheGlySerSerGluSerSerAlaValAsp 

1330 1340 

4030 4050 4070 

ACCGKiGCttGCtGACCAGGCCTCCG^ 

ThrAlaLeuProAspGlnAlaSerAspAspGlyAspLysGlySerAspValGluSer^

1350 1360 

4090 4110 4130 

tcc"tccaikx:gccccgttgagggggaagc 

SerSerMetProProLeuGluGlyGluProGlyAspProAspLeuSerAspGlySerTrp

1370 1380 

4150 4170 4190 

TCTACCGTGAGCGATCAAGCTAGTG^ 

SerThrValSerGluGlxiAlaSerGluAspValValCysCysSerMetSerl^

1390 1400 
4210 4230 4250

ACAGGCGCCTTGATCACGCCATGCGCTGCGGAGGAAAGCAAGCTGCCCATC^^ 


ThrGlyAlaLeuIleThrProCysAlaAlaGluGluSerLysLeuProIleAsnAlaLeu 

1410 1420 

4270 4290 4310 

AGCAACTCTTTGCTGCGCCACCATAACATGGTTTATGCCACAACATCTCGCAGCGCAGGC

SerAsnSerLeuLeuArgHisHisAsnMetValTyrAlaThrThrSerArgSerAlaGly

1430 1440 

4330 4350 4370 

ct<3k:ggcagaagaaggtcacctttgacagactgcaagtcc 

LeuArgGlnLysLysValThrPheAspArgLeuGlnValLeuAspAspHisTyrArgAsp

1450 1460 

4390 4410 4430

GTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCGTAGAG 

ValLeuLysGluMetLysAlaLysAlaSerThrValLysAlaLysLeuLeuSerValGlu

1470 1480 

4450 4470 4490 

GAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGGGCAA^ 

GluAlaCysLysLeuThrProProHisSerAlaLysSerLysPheGlyTyrGlyAlaLys 

1490 1500 

4510 4530 4550 

GACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCCGTGTGGAAGGACTTG 

AspValArgAsnLeuSerSerLysAlaValAsnHisIleHisSerValTrpLysAspLeu 

1510 1520 

4570 4590 4610 

CTGGAAGACACTGTGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTGT

LeuGluAspThrValThrProIleAspThrThrIleMetAlaLysAsnGluValPheCys

1530 1540 
4630 4650 4670

GTCCAAGCAGAGAAAGGAGGCCGTAAGGGAGGCGGCCTTATCGTATTCCCAGA 

ValGlnProGluLysGlyGlyArgLysProAlaArgLeuIleValPheProAspLeuGly

1550 1560 

4690 4710 4730 

GTCCGTGTATGCGAGAAGATGGCCCTCTATGATGTGGTCTCCACCCTTCCTCAGGTCGTG

ValArgValCysGluLysMetAlaLeuTyrAspValValSerThrLeuProGlnValVal

1570 1580 

4750 4770 4790 

ATGGGGTGCTCATAGGGATTCGAGTAGTGTCCTGGGGAGCGAGTGGAGTTGGT^ 

MetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArgValGluPheLeuValAsn

1590 1600

4810 4830 4850 
ACCTGGAAATCAAAGAAAAACCCCATGGGCiWrcATATGACAGTCGGTCTTTC

ThrTrpLysSerLysLysAsnProMetGlyPheSerTyrAspThrArgCysPheAspSer

1610 1620

4870 4890 4910 

ACGGTGAGGGAGAACGACATCCGTGTTCAGGAGTCAATTTAGCAATGl^ 

ThrValThrGluAsnAspIleArgValGluGluSerIleTyrGlnCysCysAspLeuAla

1630 1640 

4930 4950 4970 

GCGGAAGGCAGACAGGGCATAAAATGGGTGACAGAGCGGCTTTATATGGGGGGTCCTC

ProGluAlaArgGlnAlaIleLysSerLeuThrGluArgLeuTyrIleGlyGlyProLeu

1650 1660 

4990 5010 5030 

ACTAATTCAAAAGGGCAGAACTGCGGTTATCGCCGGTGCCGCGCGAGGGGCGTGCT^ 

ThrAsnSerLysGlyGlnAsnCysGlyTyrArgArgCysArgAlaSerGlyValLeuThr

1670 1680 
5050 5070 5090

ACTAGCTGCGGTAACACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTCTCGA 


ThrSerCysGlyAsnThrLeuThrCysTyrLeuLysAlaSerAlaAlaCysArgAlaAla 

1690 1700 

5110 5130 5150 

AAGCTCCAGGACTGCACGATGCTCGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGC

LysLeuGlnAspCysThrMetLeuValAsnGlyAspAspLeuValValIleCysGluSer

1710 1720 

5170 5190 5210 

GCGGGAACCCT^GAGGACGCGraCGAGCCTACGAGTCTTCACGGAGGCTATGACTAGGTAC 

AlaGlyThrGlnGluAspAlaAlaSerLeuArgValPheThrGluAlaMetThrArgTyr

1730 1740 

5230 5250 5270 

TCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGAGCTGATAACATCATGT 

SerAlaProProGlyAspProProGlnProGluTyrAspLeuGluLeuIleThrSerCys 

1750 1760 

5290 5310 5330 

TCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTCACCCGT 

SerSerAsnValSerValAlaHisAspAlaSerGlyLysArgValTyrTyrLeuThrArg 

1770 1780 

5350 5370 5390 

GATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAAC

AspProThrThrProLeuAlaArgAlaAlaTrpGluThrAlaArgHisThrProValAsn

1790 1800 

5410 5430 5450 

TCCTGGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATG

SerTrpLeuGlyAsnIleIleMetTyrAlaProThrLeuTrpAlaArgMetIleLeuMet

1810 1820 
5470 5490 5510

ACTCTICTTCTTCTCCATCCTTCTAGCACAGGAGCAACTO 

ThrHisPhePheSerIleLeuLeuAlaGlnGluGlnLeuGliiLysAlaLeuAspCysGln

1830 1840 

5530 5550 5570 
ATCTACGGGGCCTGTTACTCCATTGAGCCACTTGACCTACCTCAGATCATTGAACGACTG

IleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuProGlnIleIleGluArgLeu

1850 1860 

5590 5610 5630 

CATGGCCTTAGCGCATTTTCACTCCATAGTTACTCTCCAGGTGAGATCAATAGGG 

HisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAla

1870 1880 

5650 5670 5690 
TCATGCCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAGACATCGGGCCAGG

SerCysLeuArgLysLeuGlyValProProLeuArgValTrpArgHisArgAlaArgSer

1890 1900 

5710 5730 5750

GTCCGCGCTAGGCTACTGTCCCAGGGGGGGAGGGCCGCO^CTTGTGGCAAGTACC^ 


ValArgAlaArgLeuLeuSerGlnGlyGlyArgAlaAlaThrCysGlyLysTyrLeuPhe 

1910 1920 

5770 5790 5810 

AACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGCTGCGTCCCAGCTGGAC 

AsnTrpAlaValLysThrLysLeuLysLeuThrProIleProAlaAlaSerGlnLeuAsp

1930 1940 

5830 5850 5870 
TTGTCCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTC^ 

LeuSerGlyTrpPheValAlaGlyTyrSerGlyGlyAspIleTyrHisSerLeuSerArg 

1950 1960 
5890 5910 5930

GCCCGACCCCGCTGGTTCATGCTGTGCCTACTCCTACTTTCTGTAGGGGTAGGCATCTAC 



AlaArgProArgTrpPheMetLeuCysLeuLeuLeuLeuSerValGlyValGlyIleTyr

1970 1980

5950 5955 
CTGCTCCCCAACCGA (SEQ. ID. NO. 5)

LeuLeuProAsnArg (SEQ. ID. NO. 6)
1985 
1 TCGCGGGTTT CGGTGATGAC GGTGAAAACC TCTGAGACAT GCAGCTGGCG
51 GAGACGGTCA CAGGTTGTCT GTAAGCGGAT GCGGGGAGCA GACAAGCCGG
101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TGGGGGGTGG CTTAACTATG
151 CGGCATCAGA GCAGATTGTA GTGAGAGTGC AGCATATGGG GTGTGAAATA
201 CCGCACAGAT GCGTAAGGAG AAAATACCGC ATGAGATTGG CTATTGGGGA
251 TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG
301 [illegible] CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT
351 [illegible] GGGGTCATTA GTTCATAGGC GATATATGGA GTTCCGCGTT
401 [illegible] CGGTAAATGG GGCGGCTGGG TGAGCGCCGA AGGACCGCGG
451 CCCATTGATG TCAATAATGA CGTATGTTCG CATAGTAACG CCAATAGGGA
501 [illegible] ACGTCAATGG GTGGAGTATT TACGGTAAAC TGCCCACTTG
551 [illegible] AAGTGTATGA TATGGGAAGT ACGCCCCCTA TTGACGTCAA
601 TGAPCSGTAAA TGGGCGGCGT GGCATTATGG PPA/lTAPATVZ ACCTTATGGG
651 [illegible] TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATT APP AHfYS
701 [illegible] TTTGGCAGTA GATCAATGGG [illegible] GGTTTGACTC
751 ACGGGGATTT CGAAGTCTCC ACCCGATTGA CGTCAATGGG AGTTTGTTTT
801 GGCACCAAAA TGAAGGGGAC TTTCGAAAAT GTCGTAACAA CTCCGCCCCA
851 TTGACGCAAA TGGGCGGTAG GGGTGTACGG TGGGAGGTCT ATATAAGCAG
901 AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT
951 TTGACCTCCA TAGAAGACAC CGGGACCGAT [illegible] CGGCCGGGAA
1001 CGGTGCATTG GAACGCGGAT TCGGGGTGGG AAGAGTGACG TAAGTACCGC
1051 CTATAGACTC TATAGGCAGA GGGCTTTQGG TCTTATGCAT GCTATACTGT
1101 TTTTGGCTTG GGGCCTATAC ACCCGGGGTT PP'T**r ATGP T* A TAGGTGATGG
1151 TATAGCTTAG GCTATAGGTG TGGGTTATTG ACGATTATTG ACCAGTCCCG
1201 TATTGGTGAC GATACTTTGC ATTACTAATC GATAACATGG GTCTTTGGGA
1251 CAACTATCTC TATTGGCTAT ATGCGAATAG TCTGTCCTTC AGAGACTGAG
1301 At- GGAC^TuTG TATTTTTACA GGATGGGGTG CGATTTATTA TTTAGAAATT
1351 CACATATACA ACAACGCCGT CGCCCGTGCC CGCAGTTTTT ATTAAACATA
1401 GCGTGGGATC TCCACGCGAA TGTGGGGTAC GTGTTCCGGA CATGGGCTCT
1451 TCTCCGGTAG GGGCGGAGGT TGGACATGCG AGGGGTGGTC CCATGGCTGG
1501 AGCGGCTCAT GGTCGGTCGG CAGCTGCTTG CTCGTAAGAG TGGAGGGCAG
1551 ACTTAGGCAC AGGACAATGC CCACCAGGAC CAGTGTGCGG CAGAAGGCCG
1601 TGGCGGTAGG GTATGTGTCT GAAAATGAGG GTGGAGATTG GGGTGGGAGG
1651 GCTGACGCAG ATGGAAGACT TAAGGCAGCG GGAGAAGAAG ATGCAGGCAG
1701 CTGAGTTGTT GTATTCTGAT AAGAGTGAGA GGTAACTGGC GTTGGGGTGG
1751 TGTTAACGGT GGAGGGCAGT GTAGTCTGAG GAGTAGTGGT TGCTGGGGGG
1801 CGCGCCACCA GACATAATAG CTGAGAGACT AAGAGAGTGT TGCTTTGCAT
1851 GGGTCTTTTC TGCAGTGACG GTCGTTAGAT GTAGGTACCA GATATGAGAA
1901 TTCAGTCGAC AGGGGGGGCG ATCTGGTGTG CCTTCTAGTT GCCAGCCATC
1951 TGTTGTTTGC GGGTGGGGGG TGGGTTCCTT GACGGTGGAA GGTGCGAGTG
2001 CCACTGTCCT TTCCTAATAA AATGAGGAAA TTGCATGGCA TTGTCTGAGT
2051 AGGTGTCATT CTATTCTGGG GGGTGGGGTG GGGGAGGAGA GGAAGGGGGA
2101 GGATTGGGAA GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG
2151 CCGCTGCGGC CAGGTGCTGA AGAATTGACC CGGTTCCTCG TGGGCCAGAA 
2201 AGAAGCAGGC ACATCCCCTT CTCTGTGACA CACCCTGTCC ACGCCCCTGG 
2251 TTCTTAGTTC CAGCCCCACT CATAGGACAC TCATAGCTCA GGAGGGCTCC 
2301 GCCTTCAATC CCACCCGCTA AAGTACTTGG AGCGGTCTCT CCCTCCCTCA 
2351 TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG AAATTAAAGC 
2401 AAGATAGGCT ATTAAGTGCA GAGGGAGAGA AAATGCCTCC AACATGTGAG 
2451 GAAGTAATGA GAGAAATCAT AGAATTTCTT CCGCTTCCTC GCTCACTGAC 
2501 TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG CTCACTCAAA 
2551 GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA 
2601 TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG 
2651 CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG 
2701 ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG 
2751 CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG 
2801 CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC 
2851 TCATAGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA 
2901 AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA 
2951 TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC 
3001 ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG 
3051 GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGA 
3101 ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG 
3151 AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT 
3201 TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA 
3251 GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC 
3301 ACGTTAAGGG ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA 
3351 TCCTTTTAAA TTAAAAATGA AGTTTTAAAT CAATCTAAAG TATATATGAG 
3401 TAAACTTGGT CTGACAGTTA CCAATGCTTA ATCAGTGAGG CACCTATCTC 
3451 AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC GGGGGGGGGG 
3501 GGCGCTGAGG TCTGCCTCGT GAAGAAGGTG TTGCTGACTC ATACCAGGCC 
3551 TGAATCGCCC CATCATCCAG CCAGAAAGTG AGGGAGCCAC GGTTGATGAG 
3601 AGCTTTGTTG TAGGTGGACC AGTTGGTGAT TTTGAACTTT TGCTTTGCCA
3651 CGGAACGGTC TGCGTTGTCG GGAAGATGCG TGATCTGATC CTTCAACTCA
3701 GCAAAAGTTC GATTTATTCA ACAAAGCCGC CGTCCCGTCA AGTCAGCGTA 
3751 ATGCTCTGCC AGTGTTACAA CCAATTAACC AATTCTGATT AGAAAAACTC 
3801 ATCGAGCATC AAATGAAACT GCAATTTATT CATATCAGGA TTATCAATAC
3851 CATATTTTTG AAAAAGCCGT TTCTGTAATG AAGGAGAAAA CTCACCGAGG 
3901 CAGTTCCATA GGATGGCAAG ATCCTGGTAT CGGTCTGCGA TTCCGACTCG 
3951 TCCAACATCA ATACAACCTA TTAATTTCCC CTCGTCAAAA ATAAGGTTAT 
4001 CAAGTGAGAA ATCACCATGA GTGACGACTG AATCCGGTGA GAATGGCAAA 
4051 AGCTTATGCA TTTCTTTCCA GACTTGTTCA ACAGGCCAGC CATTACGCTC 
4101 GTCATCAAAA TCACTCGCAT CAACCAAACC GTTATTCATT CGTGATTGCG 
4151 CCTGAGCGAG ACGAAATACG CGATCGCTGT TAAAAGGACA ATTACAAACA 
4201 GGAATCGAAT GCAACCGGCG CAGGAACACT GCCAGCGCAT CAACAATATT
4251 TTCACCTGAA TGAGGATATT CTTCTAATAC CTGGAATGCT GTTTTCCCGG
4301 GGATCGCAGT GGTGAGTAAC CATGGATGAT CAGGAGTAGG GATAAAATGC
4351 TTGATGGTCG GAAGAGGCAT AAATTCCGTC AGCCAGTTTA GTCTGACCAT
4401 CTCATCTGTA ACATCATTGG CAAGGGTACC TTTCSCGATGT TTCAGAAACA
4451 ACTCTGGCGG ATCGGGCTTC CCATACAATC GATAGATTGT CGCACCTGAT
4501 TGCCGGACAT TATCGGGAGC GCATTTATAC CCATATAAAT CAGGATCCAT
4551 GTTGGAATTT AATCGGGGGG TCGAGGAAGA CGTTTCCGGT TGAATATGGG
4601 tCATAACAGd CCTTGTATTA CTGTTTATGT AAGCAGAGAG TTTTATTGTT
4651 CAfiSATGAtA TATTTTTATC TTGTGGAATG TAACATGAGA GATTTTGAGA
4701 GACAAGGTGG CTTTCGGGCC CGCCCCATTA TTGAAGCATT TATCAGGGTT
4751 ATTGTGfCAT GAGCGGATAG ATATTTGAAT GTATTTAGAA AAATAAACAA
4801 ATAGGGGTTC CGCGCACATT TGGCCGAAAA GTGCCACCTG AGGTCTAAGA
4851 AACCATTATT ATGATGAGAT TTU^CGTATAA AAATAGGCGT ATGAGGAGGC
4901 CCTTTCGTC
1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT
121 GATGTTGTAA GTGTGGCGGA ACACATGTAA GCGCCGGATG TGGTAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACGG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 
241 TAAATTTGGG CGTAACCAAG TAATATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA 
301 AGTGAAATCT GAATAATTCT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG CGCAGTGTAT TTATACCCGG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACGAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC 
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GAGTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAT TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA 
901 CCTTGTGCCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAAATTATGG GCAGTGGGTG 
1141 ATAGAGTGGT GGGTTTGGTG TGGTAATTTT TTTTTTAATT TTTACAGTTT TGTGGTTTAA 
1201 AGAATTTTGT ATTGTGATTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGG CGTCCTAAAT TGGTGCCTGC TATCCTGAGA
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG
1501 TCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 
1801 TTTCTGTGGG GCTCCTCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT GGTGAGACAC 
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCAA TAATACCGAC GGAGGAGCAA 
2161 CAGCAGGAGG AAGCCAGGCG GCGGCGGCGG CAGGAGCAGA GCCCATGGAA CCCGAGAGCC 
2221 GGCCTGGACC CTCGGGAATG AATGTTGTAC AGGTGGCTGA ACTGTTTCCA GAACTGAGAC
2281 GCATTTTAAC CATTAACGAG GATGGGCAGG GGCTAT^GGG GGTAAAGAAG GAGCGGGGGG 
2341 CTTCTGAGGC TACAGAGGAG GCTAGGAATC TAACTTTTAG CTTAATGACC AGACACCGTC 
2401 CTGAGTGTGT TACTTTTCAG CAGATTAAGG ATAATTGCGC TAATGAGCTT GATCTGCTGG 
2461 CGCAGAAGTA TTCCATAGAG CAGCTGACCA CTTACTGGCT GCAGCCAGGG GATGATTTTG
2521 AGGAGGCTAT TAGGGTATAT GCAAAGGTGG CACTTAGGCC AGATTGCAAG TACAAGATTA
2581 GCAAACTTGT AAATATCAGG AATTGTTGCT ACATTTCTGG GAACGGGGCC GAGGTGGAGA
2641 TAGAfACGGA GGATAGGGTG GGCTTTAGAT GTAGCATGAT AAATATGTGG CCGGGGGTGC
2701 TTGGCATGGA CGGGGTGGTT ATTATGAATG TGAGGTTTAC TGGTCCCAAT TTTAGCGGTA
2761 CGGTTTTCCT GGCCAATACG AATCTTATCC TACACGGTGT AAGCTTCTAT GGGTTTAACA
2821 ATACCTGTGT GGAAGCCTGG AGCGATGTAA GGGTTCGGGG CTGTGCCTTT TACTGCTC5CT
2881 GGAAGGGGGT GGTGTGTCGC CCCAAAAGCA GGGCTTCAAT TAAGAAATGG CTGTTTGAAA
2941 GGTGTACCTT GGGTATCCTG TCTGAGGGTA ACTCCAGGGT GCGCCACAAT GTGGCCTCCG
3001 ACTGTGGTTG CTTTATGCTA GTGAAAAGCG TGGCTGTGAT TAAGCATAAC ATGGTGTGTG
3061 GCAACTGCGA GGACAGGGCC TCTCAGATGC TGACCTGCTC GGACGGCAAG TGTCACTTGC 
3121 TGAAGACCAT TCACGTAGCC AGCGACTCTC GCAAGGCCTG GCCAGTGTTT GAGCACAACA 
3181 TACTGACCCG CTGTTCCTTG CATTTGGGTA ACAGGAGGGG GGTGTTCCTA CCTTACCAAT 
3241 GCAATTTGAG TCACACTAAG ATATTGCTTG AGCCCGAGAG CATGTCCAAG GTGAACCTGA 
3301 ACGGGGTGTT TGACATGACC ATGAAGATCT GGAAGGTGCT GAGGTACGAT GAGACCCGCA
3361 CCAGGTGCAG ACCCTGCGAG TGTGGCGGTA AACATATTAG GAACCAGCCT GTGATGCTGG
3421 ATGTGACCGA GGAGCTGAGG CCCGATCACT TGGTGCTGGC CTGCACCCGC GCTGAGTTTG
3481 GCTCTAGCGA TGAAGATACA GATTGAGGTA CTGAAATGTG TGGGCGTGGC TTAAGGGTGG
3541 GT^GAATAT ATAAGGTGGG GGTCTCATGT AGTTTTGTAT CTGTTTTGCA GCAGCCGCCG
3601 CCATGAGCGC CAACTCGTTT GATGGAAGCA TTGTGAGCTC ATATTTGACA ACGCGCATGC
3661 CCCCATGGGC CGGGGTGCGT CAGAATGTGA TGGGCTCCAG CATTGATGGT CGCCCCGTCC
3721 TGCCCGCAAA CTCTACTACC TTGACCTACG AGACCGTGTC TGGAACGCCG TTGGAGACTG
3781 CAGCCTCCGC CGCCGCTTCA GCCGCTGCAG CCACCGCCCG CGGGATTGTG ACTGACTTTG 
3841 CTTTCCTGAG CCCGCTTGCA AGCAGTGCAG CTTCCCGTTC ATCCGCCCGC GATGACAAGT
3901 TGACGGCTCT TTTGGCACAA TTGGATTCTT TdACCCGGGA ACTTAATGTC GTTTCTCAGG
3961 AGCTGTTGGA TCTGCGCCAG CAGGTTTCTG CCCTGAAGGC TTCCTCCCCT CCCAATGCGG 
4021 TTTAAAACAT AAATAAAAAC CAGACTCTGT TTGGATTTGG ATCAAGCAAG TGTCTTGCTG 
4081 TCTTTATTTA GGGGTTTTGC GCGCGCGGTA GGCCCGGGAC CAGCGGTCTC GGTCGTTGAG 
4141 GGTCCTGTGT ATTTTTTCCA GGACGTGGTA AAGGTGACTC TGGATGTTCA GATACATGGG 
4201 CATAAGCCCG TCTCTGGGGT GGAGGTAGCA CCACTGCAGA GCTTCATGCT GCGGGGTGGT 
4261 GTTGTAGATG ATCCAGTCGT AGCAGGAGCG CTGGGCGTGG TGCCTAAAAA TGTCTTTCAG 
4321 TAGCAAGCTG ATTGCCAGGG GCAGGCCCTT GGTGTAAGTG TTTACAAAGC GGTTAAGCTG 
4381 GGATGGGTGC ATACGTGGGG ATATGAGATG CATCTTGGAC TGTATTTTTA GGTTGGCTAT 
4441 GTTCCCAGCC ATATCCCTCC GGGGATTCAT GTTGTGCAGA ACCACCAGCA CAGTGTATCC 
4501 GGTGCACTTG GGAAATTTGT CATGTAGCTT AGAAGGAAAT GCGTGGAAGA ACTTGGAGAC
4561 GCCCTTGTGA CCTCCAAGAT TTTCCATGCA TTCGTCCATA ATGATGGCAA TGGGCCCACG 
4621 GGCGGCGGCC TGGGCGAAGA TATTTCTGGG ATCACTAACG TCATAGTTGT GTTCCAGGAT 
4681 GAGATCGTCA TAGGCCATTT TTACAAAGCG CGGGCGGAGG GTGCCAGACT GCGGTATAAT 
4741 GGTTCCATCC GGCCCAGGGG CGTAGTTACC CTCACAGATT TGCATTTCCC ACGCTTTGAG 
4801 TTCAGATGGG GGGATCATGT CTACCTGCGG GGCGATGAAG AAAACCGTTT CCGGGGTAGG 
4861 GGAGATCAGC TGGGAAGAAA GCAGGTTCCT AAGCAGCTGC GACTTACCGC AGCCGGTGGG 
4921 CCCGTAAATC ACACCTATTA CCGGCTGCAA CTGGTAGTTA AGAGAGCTGC AGCTGCCGTC 
4981 ATCCCTGAGG AGGGGGGCCA CTTCGTTAAG CATGTCCCTG ACTTGCATGT TTTCCCTGAC
5041 CAAATCCGCC AGAAGGCGCT CGCCGCCCAG CGATAGCAGT TCTTGCAAGG AAGCAAAGTT
5101 TTTCAACGGT TTGAGGCCGT CCGCCGTAGG CATGCTTTTG AGCGTTTGAC CAAGCAGTTC
5161 CAGGCGGTCC CACAGCTCGG TCACGTGCTC TACGGCATCT CGATCCAGCA TATCTCCTCG
5221 TTTCGCGGGT TGGGGCGGCT TTCGCTGTAC GGCAGTAGTC GGTGCTCGTC CAGACGGGCC
5281 AGGGTCATGT CTTTCCACGG GCGCAGGGTC CTCGTCAGCG TAGTCTGGGT CACGGTGAAG
5341 GGGTGCGCTC CGGGTTGCGC GCTGGCCAGG GTGCGCTTGA GGCTGGTCCT GCTGGTGCTG
5401 AAGCGCTGCC GGTCTTCGCC CTGCGCGTCG GCCAGGTAGC ATTTGACCAT GGTGTCATAG
5461 TCCAGCCCCT CCGCGGCGTG GCCCTTGGCG CGCAGCTTGC CCTTGGAGGA GGCGCCGCAC
5521 GAGGGGCAGT GCAGACTTTT AAGGGCGTAG AGCTTGGGCG CGAGAAATAC CGATTCCGGG
5581 GAGTAGGCAT CCGCGCCGCA GGCCCCGCAG ACGGTCTCGC ATTCCACGAG CCAGGTGAGC
5641 TCTGGCCGTT CGGGGTCAAA AACCAGGTTT CCCCCATGCT TTTTGATGCG TTTCTTACCT
5701 CTGGTTTCCA TGAGCCGGTG TCCACGCTCG GTGACGAAAA GGCTGTCCGT GTCCCCGTAT
5761 ACAGACTTGA GAGGCCTGTC CTCGAGCGGT GTTCCGCGGT CCTCCTCGTA TAGAAACTCG
5821 GACCACTCTG AGACGAAGGC TCGCGTCCAG GCCAGCACGA AGGAGGCTAA GTCSGGAGGGG
5881 TAGCGGTCGT TGTCCACTAG GGGGTCCACT CGCTCCAGGG TGTGAAGACA CATGTCGCCC
5941 TCTTCGGCAT CAAGGAAGGT GATTGGTTTA TAGGTGTAGG CCACGTGACC GGGTGTTCCT
6001 GAAGGGGGGC TATAAAAGGG GGTGGGGGCG CGTTCGTCCT CACTCTCTTC CGCATCGCTG
6061 TCTGCGAGGG CCAGCTGTTG GGGTGAGTAC TCCCTCTCAA AAGCGGGCAT GACTTCTGCG
6121 CTAAGATTGT CAGTTTCCAA AAACGAGGAG GATTTGATAT TCACCTGGCC CGCGGTGATG
6181 CCTTTGAGGG TGGCCGCGTC CATCTGGTCA GAAAAGACAA TCTTTTTGTT GTCAAGCTTG
6241 GTGGCAAACG ACCCGTAGAG GGCGTTGGAC AGCAACTTGG CGATGGAGCG CAGGGTTTGG
6301 TTTTTGTCGC GATCGGCGCG CTCCTTGGCC GCGATGTTTA GCTGCACGTA TTCGCGCGCA
6361 ACGCACCGCC ATTCGGGAAA GACGGTGGTG CGCTCGTCGG GCACTAGGTG CACGCGCCAA
6421 CCGCGGTTGT GCAGGGTGAC AAGGTCAACG CTGGTGGCTA CCTCTCCGCG TAGGCGCTCG
6481 TTGGTCCAGC AGAGGCGGCC GCCCTTGCGC GAGCAGAATG GCGGTAGTGG GTCTAGCTGC
6541 GTCTCGTCCG GGGGGTCTGC GTCCACGGTA AAGACCCCGG GCAGCAGGCG CGCGTCGAAG
6601 TAGTCTATCT TGCATCCTTG CAAGTCTAGC GCCTGCTGCC ATGCGCGGGC GGCAAGCGCG
6661 CGCTCGTATG GGTTGAGTGG GGGACCCCAT GGCATGGGGT GGGTGAGCGC GGAGGCGTAC
6721 ATGCCGCAAA TGTCGTAAAC GTAGAGGGGC TCTCTGAGTA TTCCAAGATA TGTAGGGTAG
6781 CATCTTCCAC CGCGGATGCT GGCGCGCACG TAATCGTATA GTTCGTGCGA GGGAGCGAGG
6841 AGGTCGGGAC CGAGGTTGCT ACGGGCGGGC TGCTCTGCTC GGAAGACTAT CTGCCTGAAG
6901 ATGGCATGTG AGTTGGATGA TATGGTTGGA CGCTGGAAGA CGTTGAAGCT GGCGTCTGTG
6961 AGACCTACCG CGTCACGCAC GAAGGAGGCG TAGGAGTCGC GCAGCTTGTT GACCAGCTCG
7021 GCGGTGACCT GCACGTCTAG GGCGCAGTAG TCCAGGGTTT CCTTGATGAT GTCATACTTA
7081 TCCTGTCCCT TTTTTTTCCA CAGCTCGCGG TTGAGGACAA ACTCTTCGCG GTCTTTCCAG
7141 TACTCTTGGA TCGGAAACCC GTCGGCCTCC GAACGGTAAG AGCCTAGCAT GTAGAACTGG
7201 TTGACGGCCT GGTAGGCGCA GCATCCCTTT TCTACGGGTA GCGCGTATGC CTGCGCGGCC
7261 TTCCGGAGCG AGGTGTGGGT GAGCGCAAAG GTGTCCCTAA CCATGACTTT GAGGTACTGG
7321 TATTTGAAGT CAGTGTCGTC GCATCCGCCC TGCTCCCAGA GCAAAAAGTC CGTGCGCTTT
7381 TTGGAACGCG GGTTTGGCAG GGCGAAGGTG ACATCGTTGA AGAGTATCTT TCCCGCGCGA
7441 GGCATAAAGT TGCGTGTGAT GCGGAAGGGT CCCGGCACCT CGGAACGGTT GTTAATTACC
7501 TGGGCGGCGA GCACGATCTC GTCAAAGCCG TTGATGTTGT GGCCCACAAT GTAAAGTTCC
7561 AAGAAGCGCG GGATGCCCTT GATGGAAGGC AATTTTTTAA GTTCCTCGTA GGTGAGfcTCT
7621 TCAGGGGAGC TGAGCCCGTG CTCTGAAAGG GCCCAGTCTG CAAGATGAGG GTTGGAAGCG
7681 ACGAATGAGC TCCACAGGTC ACGGGCCATT AGCATTTGCA GGTGGTCGCG AAAGGTCCTA
7741 AACTGGCGAC CTATGGCCAT TTTTTCTGGG GTGATGCAGT AGAAGGTAAG CGGGTCTTGT
7801 TCCCAGCGGT CCCATCCAAG GTCCGCGGCT AGGTCTCGCG CGGCGGTCAC TAGAGGCTCA
7861 TCTCCGCCGA ACTTCATGAC GAGCATGAAG GGCACGAGCT GCTTCCCAAA GGCCCCCATC
7921 CAAGTATAGG TCTCTACATC GTAGGTGACA AAGAGACGCT CGGTGCGAGG ATGCGAGCCG
7981 ATCGGGAAGA ACTGGATCTC CCGCCACCAG TTGGAGGAGT GGCTGTTGAT GTGGTGAAAG
8041 TAGAAGTCCC TGCGACGGGC GGAACACTCG TGCTGGCTTT TGTAAAAACG TGCGCAGTAd
8101 TGGCAGCGGT GCACGGGCTG TACATCCTGC ACGAGGTTGA CCTGACGACC GCGCACAAGG 
8161 AAGCAGAGTG GGAATTTGAG CCCCTCGCCT GGCGGGTTTG GCTGGTGGTC TTCTACTTCG 
8221 iSCTGCTTGTC CTTGACCGTC TGGCTGCTCG AGGGGAGTTA CGGTGGATCG GACCACCACG 
8281 CCGCGCGAGC CCAAAGTCCA GATGTCCGCG CGCGGCGGTC GGAGCTTGAT GACAACATCG 
8341 CGCAGATGGG AGCTGTCCAT GGTCTGGAGC TCCCGCGGCG TCAGGTCAGG CGGGAGCTCC
8401 TGCAGGTTTA CCTCGCATAG CCGGGTCAGG GCGCGGGICTA GGTCCAGGTG ATACCTGATT 
8461 TCCAGGGGCT GGTTGGTGGC GGCGTCGATG GCTTGCAAGA GGCCGCATCC CCGCGGCGCG 
8521 ACTACGGTAC CGCGCGGCGG GCGGTGGGCC GCGGGGGTGT CCTTGGATGA TGCATCTAAA 
8581 AGCGGTGACG CGGGCGGGCC CCCGGAGGTA GGGGGGGCTC GGGACCCGCC GGGAGAGGGG 
8641 GCAGGGGCAC GTCGGCGCCG CGCGCGGGCA GGAGCTGGTG CTGCGCGCGG AGGTTGCTGG 
8701 CGAACGCGAC GACGCGGCGG TTGATCTCCT GAATCTGGCG CCTCTGCGTG AAGACGACGG
8761 GCCCGGTGAG CTTGAACCTG JVAAGAGAGTT CGACAGAATC AATTTCGGTG TCGTTGACGG
8821 CGGCCTGGCG CAAAATCTCC TGCACGTCTC CTGAGTTGTC TTGATAGGCG ATCTCGGCCA 
8881 TGAACTGCTC GATCTCTTCC TCCTGGAGAT CTCCGCGTCC GGCTCGCTCC ACGGTGGCGG 
8941 CGAGGTCGTT GGAGATGCGG GCCATGAGCT GCGAGAAGGC GTTGAGGCCT CCCTCGTTCC 
9001 AGACGCGGCT GTAGACCACG CCCCCTTCGG CATCGCGGGC GCGCATGACC ACCTGCGCGA 
9061 GATTGAGCTC CACGTGCCGG GCGAAGACGG CGTAGTTTCG CAGGCGCTGA AAGAGGTAGT 
9121 TGAGGGTGGT GGCGGTGTGT TCTGCCACGA AGAAGTACAT AACCCAGCGC CGCAACGTGG 
9181 ATTCGTTGAT ATCCCCCAAG GCCTCAAGGC GCTCCATGGC CTCGTAGAAG TCCACGGCGA 
9241 AGTTGAAAAA CTGGGAGTTG CGCGCCGACA CGGTTAACTC CTCCTCCAGA AGACGGATGA 
9301 GCTCGGCGAC AGTGTCGCGC ACCTCGCGCT CAAAGGCTAC AGGGGCCTCT TCTTCTTCTT 
9361 CAATCTCCTC TTCCATAAGG GCCTCCCCTT CTTCTTCTTC TGGCGGCGGT GGGGGAGGGG 
9421 GGACACGGCG GCGACGACGG CGCACCGGGA GGCGGTCGAC AAAGCGCTCG ATCATCTCCC 
9481 CGCGGCGACG GGGCATGGTC TCGGTGACGG CGCGGCCGTT CTCGCGGGGG CGCAGTTGGA 
9541 AGACGCCGCC CGTCATGTCC CGGTTATGGG TTGGCGGGGG GCTGCCGTGC GGCAGGGATA 
9601 CGGCGCTAAC GATGCATCTC AACAATTGTT GTGTAGGTAC TCCGCCACCG AGGGACCTGA 
9661 GCGAGTCCGC ATCGACCGGA TCGGAAAACC TCTCGAGAAA GGCGTCTAAC CAGTCACAGT 
9721 CGCAAGGTAG GCTGAGCACC GTGGCGGGCG GCAGCGGGCG GCGGTCGGGG TTGTTTCTGG
9781 CGGAGGTGCT GCTGATGATG TAATTAAAGT AGGCGGTCTT GAGACGGCGG ATGGTCGACA
9841 GAAGCACCAT GTCCTTGGGT CCGGCCTGCT GAATGCGCAG GCGGTGGGCC ATGCCCCAGG
9901 CTTCGTTTTG ACATCGGCGC AGGTCTTTGT AGTAGTCTTG CATGAGCCTT TCTACCGGCA
9961 CTTCTTCTTC TCCTTCCTCT TGTCCTGCAT CTCTTGCATC TATCGCTGCG GCGGCGGCGG 
10021 AGTTTGGCCG TAGGTGGCGC CCTCTTCCTC CCATGCGTGT GACCCCGAAG CCCCTCATCG 
10081 GCTGAAGCAG GGCCAGGTCG GCGACAACGC GCTCGGCTAA TATGGCCTGC TGCACCTGCG
10141 TGAGGGTAGA CTGGAAGTCG TCCATGTCCA CAAAGCGGTG GTATGCGCCC GTGTTGATGG 
10201 TGTAAGTGCA GTTGGCCATA ACGGACCAGT TAACGGTCTG GTCACCCGGC TGCGAGAGCT 
10261 CGGTGTACCT GAGACGCGAG TAAGCCCTTG AGTCAAAGAC GTAGTCGTTG CAAGTCCGCA 
10321 CCAGGTACTG GTATCCCACC AAAAAGTGCG GCGGCGGCTG GCGGTAGAGG GGCCAGCGTA 
10381 GGGTGGCCGG GGCTCCGGGG GCGAGGTCTT CCAACATAAG GCGATGATAT CCGTAGATGT
10441 ACCTGGACAT CCAGGTGATG CCGGCGGCGG TGGTGGAGGC GCGCGGAAAG TCACGGACGC 
10501 GGTTCCAGAT GTTGCGCAGC GGCAAAAAGT GCTCCATGGT CGGGACGCTC TGGCCGGTCA 
10561 GGCGCGCGCA GTCGTTGACG CTCTAGACCG TGCAAAAGGA GAGCCTGTAA GCGGGCACTC 
10621 TTCCGTGGTC TGGTGGATAA ATTCGCAAGG GTATCATGGC GGACGACCGG GGTTCGAACC 
10681 CCGGATCCGG CCGTCCGCCG TGATCCATGC GGTTACCGCC CGCGTGTCGA ACCCAGGTGT 
10741 GCGACGTCAG ACAACGGGGG AGCGCTCCTT TTGGCTTCCT TCCAGGCGCG GCGGATGCTG 
10801 CGCTAGCTTT TTTGGCCACT GGCCGCGCGC GGCGTAAGCG GTTAGGCTGG AAAGCGAAAG 
10861 CATTAAGTGG CTCGCTCCCT GTAGCCGGAG GGTTATTTTC CAAGGGTTGA GTCGCGGGAC 
10921 CCCCGGTTCG AGTCTCGGGC CGGCCGGACT GCGGCGAACG GGGGTTTGCC TCCCCGTCAT 
10981 GCAAGACCCC GCTTGCAAAT TCCTCCGGAA ACAGGGACGA GCCCCTTTTT TGCTTTTCCC 
11041 AGATGCATCC GGTGCTGCGG CAGATGCGCC CCCCTCCTCA GCAGCGGCAA GAGCAAGAGC 
11101 AGCGGCAGAC ATGCAGGGCA CCCTCCCCTT CTCCTACCGC GTCAGGAGGG GCAACATCCG 
11161 CGGCTGACGC GGCGGCAGAT GGTGATTACG AACCCCCGCG GCGCCGGACC CGGCACTACT 
11221 TGGACTTGGA GGAGGGCGAG GGCCTGGCGC GGCTAGGAGC GCCCTCTCCT GAGCGACACC 
11281 CAAGGGTGCA GCTGAAGCGT GACACGCGCG AGGCGTACGT GCCGCGGCAG AACCTGTTTC 
11341 GCGACCGCGA GGGAGAGGAG CCCGAGGAGA TGCGGGATCG AAAGTTCCAT GCAGGGCGCG
11401 AGTTGCGGCA TGGCCTGAAC CGCGAGCGGT TGCTGCGCGA GGAGGACTTT GAGCCCGACG 
11461 CGCGGACCGG GATTAGTCCC GCGCGCGCAC ACGTGGCGGC CGCCGACCTG GTAACCGCGT 
11521 ACGAGCAGAC GGTGAACCAG GAGATTAACT TTCAAAAAAG CTTTAACAAC CACGTGCGCA 
11581 CGCTTGTGGC GCGCGAGGAG GTGGCTATAG GACTGATGCA TCTGTGGGAC TTTGTAAGCG
11641 CGCTGGAGCA AAACCCAAAT AGCAAGCCGC TCATGGCGCA GCTGTTCCTT ATAGTGCAGC 
11701 ACAGCAGGGA CAACGAGGCA TTCAGGGATG CGCTGCTAAA CATAGTAGAG CCCGAGGGCC 
11761 GCTGGCTGCT CGATTTGATA AACATTCTGC AGAGCATAGT GGTGCAGGAG CGCAGCTTGA 
11821 GCCTGGCTGA CAAGGTGGCC GCCATTAACT ATTCCATGCT CAGTCTGGGC AAGTTTTACG 
11881 CCCGCAAGAT ATACCATACC CCTTACGTTC CCATAGACAA GGAGGTAAAG ATCGAGGGGT 
11941 TCTACATGCG CATGGCGCTG AAGGTGCTTA CCTTGAGCGA CGACCTGGGC GTTTATCGCA 
12001 ACGAGCGCAT CCACAAGGCC GTGAGCGTGA GCCGGCGGCG CGAGCTCAGC GACCGCGAGC 
12061 TGATGCACAG CCTGCAAAGG GCCCTGGCTG GCACGGGCAG CGGCGATAGA GAGGCCGAGT 
12121 CCTACTTTGA CGCGGGCGCT GACCTGCGCT GGGCCCCAAG CCGACGCGCC CTGGAGGCAG
12181 CTGGGGCCGG ACCTGGGCTG GCGGTGGCAC CCGCGCGCGC TGGCAACGTC GGCGGCGTGG 
12241 AGGAATATGA CGAGGACGAT GAGTACGAGC CAGAGGACGG CGAGTACTAA GCGGTGATGT 
12301 TTCTGATCAG ATGATGCAAG ACGCAACGGA CCCGGCGGTG CGGGCGGCGC TGCAGAGCCA 
12361 GCCGTCCGGC CTTAACTCCA CGGACGACTG GCGCCAGGTC ATGGACCGCA TCATGTCGCT 
12421 GACTGCGCGC AACCCTGACG CGTTCCGGCA GCAGCCGCAG GCCAACCGGC TCTCCGCAAT 
12481 TCTGGAAGCG GTGGTCCCGG CGCGCGCAAA CCCCACGCAC GAGAAGGTGC TGGCGATCGT 
12541 AAACGCGCTG GCCGAAAACA GGGCCATCCG GCCCGATGAG GCCGGCCTGG TCTACGACGC 
12601 GCTGCTTCAG CGCGTGGCTC GTTACAACAG CAGCAACGTG CAGACCAACC TGGACCGGCT
12661 GGTGGGGGAT GTGCGCGAGG GGGTGGCGCA GCGTGAGCGC GCGCAGCAGC AGGGCAACCT
12721 GGGCTCCATG GTTGCACTAA ACGCCTTCCT GAGTACACAG CCCGCCAACG TGCCGCGGGG 
12781 ACAGGAGGAC TACACCAACT TTGTGAGCGC ACTGCGGCTA ATGGTGACTG AGACACCGCA 
12841 AAGTGAGGTG TATCAGTCCG GGCCAGACTA TTTTTTCCAG ACCAGTAGAC AAGGCCTGCA
12901 GACCGTAAAC CTGAGCCAGG CTTTCAAGAA CTTGCAGGGG CTGTGGGGGG TGCGGGCTCC 
12961 CACAGGCGAC CGCGCGACCG TGTCTAGCTT GCTGACGCCC AACTCGCGCC TGTTGCTGCT
13021 GCTAATAGCG CCCTTCACGG ACAGTGGCAG CGTGTCCCGG GACACATACC TAGGTCACTT 
13081 GCTGACACTG TACCGCGAGG CCATAGGTCA GGCGCATGTG GACGAGCATA CTTTCCAGGA 
13141 GATTACAAGT GTTAGCCGCG CGCTGGGGCA GGAGGACACG GGCAGCCTGG AGGCAACCCT
13201 GAACTACCTG CTGACCAACC GGCGGCAAAA AATCCCCTCG TTGCACAGTT TA/^CAGCGA
13261 GGAGGAGCGC ATTTTGCGCT ATGTGCAGCA GAGCGTGAGC CTTAACCTGA TGCGCGACGG 
13321 GGTAACGCCC AGCGTGGCGC TGGACATGAC CGCGCGCAAC ATGGAACCGG GCATGTATGC 

13381 etCAAACCGG CCGTTTATCA ATCGCCTAAT GGACTACTTG CATCGCGCGG CCGCCGTGJ^
13441 CCCCGAGTAT TTCACCAATG CCATCTTGAA CCCGCACTGG CTACCGCCCC CTGGTTTCTA 
13501 CACCGGGGGA TTCGAGGTGC GCGAGGGTAA CGATGGATTC CTCTGGGACG ACATAGACGA 
13561 CAGCGTGTTT TCCCCGCAAC CGCAGACCCT GCTAGAGTTG CAACAACGCG AGCAGGCAGA 
13621 GGCGGCGCTG GGAAAGGAAA GCTTCCGCAG GCCAAGCAGC TTGTCCGATC TAGGCGCTGC 
13681 GGCCCCGCGG TCAGATGCTA GTAGCCCATT TCCAAGCTTG ATAGGGTCTC TTACCAGCAC
13741 TCGCACCACC CGCCCGCGCC TGCTGGGCGA GGAGGAGTAC CTAAACAACT CGCTGCTGCA
13801 GCCGCAGCGC GAAAAGAACC TGCCTCCGGC GTTTCCCAAC AACGGGATAG AGAGCCTAGT 
13861 GGACAAGATG AGTAGATGGA AGACGTATGC GCAGGAGCAC AGGGATGTGC CCGGCCCGCG 
13921 CCCGCCCACC CGTCGTCAAA GGCACGACCG TCAGCGGGGT CTGGTGTGGG AGGACGATGA 
13981 CTCGGCAGAC GACAGCAGCG TCTTGGATTT GGGAGGGAGT GGCAACCCGT TTGCACACCT
14041 TCGCCCCAGG CTGGGGAGAA TGTTTTATU^ AAAGCATGAT GCAAAATAAA AAACTCACCA 
14101 AGGCCATGGC ACCGAGCGTT GGTTTTCTTG TATTCCCCTT AGTATGCGGC GCGCGGCGAT 
14161 GTATGAGGAA GGTCCTCCTC CCTCCTACGA GAGCGTGGTG AGCGCGGCGC CAGTGGCGGC 
14221 GGCGCTGGGT TCACCCTTCG ATGCTCCCCT GGACCCGCCG TTCGTGCCTC CGCGGTACCT 
14281 GCGGCCTACC GGGGGGAGAA ACAGCATCCG TTACTCTGAG TTGGCACCCC TATTCGACAC 
14341 CACCCGTGTG TACCTTGTGG ACAACAAGTC AACGGATGTG GCATCCCTGA ACTACCAGAA 
14401 CGACCACAGC AACTTTCTAA CCACGGTCAT TCAAAACAAT GACTACAGCC CGGGGGAGGC 
14461 TUVGCACACAG ACCATCAATC TTGACGACCG GTCGCACTGG GGCGGCGACC TGAAAACCAT 
14521 CCTGCATACC AACATGCCAA ATGTGAACGA GTTCATGTTT ACCAATAAGT TTAAGGCGCG 
14581 GGTGATGGTG TCGCGCTCGC TTACTAAGGA CAAACAGGTG GAGCTGAAAT ACGAGTGGGT 
14641 GGAGTTCACG CTGCCCGAGG GCAACTACTC CGAGACCATG ACCATAGACC TTATGAACAA
14701 CGCGATCGTG GAGCACTACT TGAAAGTGGG CAGGCAGAAC GGGGTTCTGG AAAGCGACAT 
14761 CGGGGTAAAG TTTGACACCC GCAACTTCAG ACTGGGGTTT GACCCAGTCA CTGGTCTTGT 
14821 CATGCCTGGG GTATATACAA ACGAAGCCTT CCATCCAGAC ATCATTTTGC TGCCAGGATG
14881 CGGGGTGGAC TTCACCCACA GCCGCCTGAG CAACTTGTTG GGCATCCGCA AGCGGCAACC
14941 CTTCCAGGAG GGCTTTAGGA TCACCTACGA TGACCTGGAG GGTGGTAACA TTCCCGCACT
15001 GTTGGATGTG GACGCCTACC AGGCAAGCTT GAAAGATGAC ACCGAACAGG GCGGGGGTGG 
15061 CGCAGGCGGC GGCAACAACA GTGGCAGCGG CGCGGAAGAG AACTCCAACG CGGCAGCTGC 
15121 GGCAATGCAG CCGGTGGAGG ACATGAACGA TCATGCCATT CGCGGCGACA CCTTTGCCAC
15181 ACGGGCGGAG GAGAAGCGCG CTGAGGCCGA GGCAGCGGCC GAAGCTGCCG CCCCCGCTGC 
15241 GGAGGCTGCA CAACCCGAGG TCGAGAAGCC TCAGAAGAAA CCGGTGATTA AACCCCTGAC 
15301 AGAGGACAGC AAGAAACGCA GTTACAACCT AATAAGCAAT GACAGCACCT TCACCCAGTA 
15361 CCGCAGCTGG TACCTTGCAT ACAACTACGG CGACCCTCAG GCCGGGATCC GCTCATGGAC 
15421 CCTGCTTTGC ACTCCTGACG TAACCTGCGG CTCGGAGCAG GTATACTGGT CGTTGCCCGA 
15481 CATGATGCAA GACCCCGTGA CCTTCCGCTC CACGCGCCAG ATCAGCAACT TTCCGGTGGT 
15541 GGGCGCCGAG CTGTTGCCCG TGCACTCCAA GAGCTTCTAC AACGACCAGG CCGTCTACTC 
15601 CCAGCTCATC CGCCAGTTTA CCTCTCTGAC CCACGTGTTC AATCGCTTTC CCGAGAACCA 
15661 GATTTTGGCG CGCCCGCCAG CCCCCACCAT CACCACCGTC AGTGAAAACG TTCCTGCTCT 
15721 CACAGATCAC GGGACGCTAC CGCTGCGCAA CAGCATCGGA GGAGTCCAGC GAGTGACCAT 
15781 TACTGACGCC AGACGCCGCA CCTGCCCCTA CGTTTACAAG GCCCTGGGCA TAGTCTCGCC 
15841 GCGCGTCCTA TCGAGCCGCA CTTTTTGAGC AAGCATGTCC ATCCTTATAT CGCCCAGCAA
15901 TAACACAGGC TGGGGCCTGC GCTTCCCAAG CAAGATGTTT GGCGGGGCCyi AGAAGCGCTC 
15961 CGACCAACAC CCAGTGCGCG TGCGCGGGCA CTACCGCGCG CCCTGGGGCG CGCACAAACG 
16021 CGGCCGCACT GGGCGCACCA CCGTCGATGA CGCCATCGAC GCGGTGGTGG AGGAGGCGCG 
16081 CAACTACACG CCCACGCCGC CGCCAGTGTC CACCGTGGAC GCGGCCATTC AGACCGTGGT 
16141 GCGCGGAGCC CGGCGCTACG CTAAAATGAA GAGACGGCGG AGGCGCGTAG CACGTCGCCA 
16201 CCGCCGCCGA CCCGGCACTG CCGCCCAACG CGCGGCGGCG GCCCTGCTTA ACCGCGCACG 
16261 TCGCACCGGC CGACGGGCGG CCATGCGAGC CGCTCGAAGG CTGGCCGCGG GTATTGTCAC 
16321 TGTGCCCCCC AGGTCCAGGC GACGAGCGGC CGCCGCAGCA GCCGCGGCCA TTAGTGCTAT 
16381 GACTCAGGGT CGCAGGGGCA ACGTGTACTG GGTGCGCGAC TCGGTTAGCG GGCTGCGCGT 
16441 GCCCGTGCGC ACCCGCCCCC CGCGCAACTA GATTGCAATA AAAAACTACT TAGACTCGTA 
16501 CTGTTGTATG TATCCAGCGG CGGCGGCGCG CATCGAAGCT ATGTCCAAGC GCAAAATCAA 
16561 AGAAGAGATG CTCCAGGTCA TCGCGCCGGA GATCTATGGC CCCCCGAAGA AGGAAGAGCA
16621 GGATTACAAG CCCCGAAAGC TAAAGCGGGT CAAAAAGAAA AAGAAAGATG ATGATGATCA 
16681 TGAACTTGAC GACGAGGTGG AACTGTTGCA CGCGACCGCG CCCAGGCGAC GGGTACAGTG 
16741 GAAAGGTCGA CGCGTAAGAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TGACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGT CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAGC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCACCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACCA CCAGTAGCAC 
17221 TAGTATTGCC ACTGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCGGCGGT 
17281 GGCAGATGCC GCGGTGCAGG CGGCCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA
17341 AACGGACCCG TGGATGTTTC GTGTTTCAGC CCCCCGGCGT CCGCGCCGTT CAAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATCG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGGCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG 
17641 CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCAGCT GCCCSCCTCCG
17701 TirrCCCGGtG CCGGGATTCG GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG
17761 CCTGACGGGG GGCATGCGTC GTGCGCAGCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT
17821 GCGCGGCGGT ATGCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC
17881 CGGAATTGGA TGGGTGGCCT TGCAGGCGGA GAGACACTGA TTAAAAACAA GTTACATGTG
17941 GAAAAATCAA AATAAAAGTC TGGACTGTCA GGCTCGGTTG GTGGTGTAAC TATTTTGTAG 
18001 AATGGAAGAC ATCAACTTTG GGTCACTGGG GGGGGGAGAC GGCTCGCGCG GGTTGATGGG 
18061 AAACTGGCAA GATATCGGCA CCAGCAATAT GAGGGGTGGC GGCTTCAGCT GGGGCTCGCT
18121 GTGGAGGGGC ATTAAAAATT TCGGTTCGGG CGTTAAGAAG TATGGCAGGA AAGGCTGGAA
18181 GAGGAGCACA GGGGAiSATGG TGAGGGACAA GTTGAAAGAG CAAAATTTCC AACAAAAGGT 
18241 GGTAGATGGC GTGGCGTGTG GCATTAGGGG GGTGGTGGAG CTGGCGAACC AGGGAGTGCA 
18301 AAATAAGATT AAGAGTAAGC TTGATCGCCG GCCTCCGGTA GAGGAGCGTC CACGGGCGGT 
18361 GGAGACAGTG TGTGGAGAGG GGGGTGGCGA AAAGCGTCGG CGACCCGAGA GGGAAGAAAC 
18421 TGTGGTGACG GAAATAGAGG AGGGTGGGTC GfAGGAGGAG GGAGTAAAGC AAGGGGTGGG 
18481 CAGGACCGGT CGGATCGCGG CGATGGCTAC GGGAGTGCTG GGCGAGCAGA CACCGGTAAC 
18541 GGTGGAGGTG GCTGCCGGGG GGGAGAGGCA GGAGAAAGGT GTGGTGCGAG GCGGGTCCGC 
18601 CGTTGTTGTA AGCCGTCCTA GGCGGGGGTG CGTGCGCGGG GCCGGCAGGG GTGCGCGATC
18661 GTTGCGGGGG GTAGGCAGTG GGAAGTGGCA AAGGAGAGTG AAGAGGATCG TGGGTTTGGG 
18721 GGTGCAATCG CTGAAGCGCC GACGATGCTT CTGATAGCTA ACGTGTCGTA TGTGTGTGAT 
18781 GTATGCGTCC ATGTCGCCGC GAGAGGAGCT GCTGAGCCGG CGGGGGCGGG CTTTGCAAGA 
18841 TGGGTACCCG TTGGATGATG CCGCAGTGGT CTTAGATGGA GATCTGGGGC GAGGACGCCT 
18901 GGGAGTAGCT GAGGGGGGGG CTGGTGGAGT TGGCCCGGGC CAGGGAGACG TAGTTGAGdG 
18961 TGAATAACAA GTTTAGAAAC GCCACGGTGG CGCCTACGCA CGACGTGACC ACAGACCGGT 
19021 GTCAGCGTTT GACGCTGCGG TTGATCGCCG TGGACCGCGA GGATAGTGGG TACTCGTACA 
19081 AGGGGCGGTT GAGCCTAGCT GTGGGTGATA AGCGTGTGGT AGACATGGGT TCGAGGTAGT
19141 TTGACATCCG CGGCGTGGTG GAGAGGGGCG CTACTTTTAA GCCCTACTCT GGGACTGCGT 
19201 ACAACGCACT GGGGGGGAAG GGTGCCCCGA AGTCGTGCGA GTGGGAAGAA AATGAAACTG 
19261 GACAAGTGGA TGCTCAAGAA CTTGACGAAG AGGAGAATGA AGCCAATGAA GCTCAGGCGC 
19321 GAGAACAGGA AGAAGCTAAG AAAACGGATG TATATGGGGA GGCTGGACTG TCGGGAATAA 
19381 AAATAACTAA AGAAGGTCTA CAAATAGGAA GTGCCGAGGG CACAGTAGCA GGTGCCGGCA 
19441 AAGAAATTTT CGCAGACAAA ACTTTTCAAG GTGAACCAGA AGTAGGAGAA TCTCAATGGA 
19501 ACGAAGCGGA TGCGAGAGCA GGTGGTGGAA GGGTTCTTT^ AAAGAGAACT GCGATGAAAC 
19561 GCTGGTATGG GTGATACGGT AGAGGGAGGA ATTGGAAGGG GGGACAGGGG GTTATGGTTG 
19621 AAGAAAATGG TAAATTGGAA AGTGAAGTCG AAATGCAATT TTTTTCGAGA TCGACAAATG 
19681 GGAGAAATGA AGTTAAGAAT ATAGAAGGAA GAGTTGTATT GTACAGGGAA GATGTAAAGA 
19741 TGGAAAGTGG AGATAGTCAT GTTTGTTATA AAGCTAAAAT GGGGGATAAA AATGGCAAAG 
19801 TCATGCTTGG AGAAGAAGGA ATGGGAAACA GAGCAAATTA GATTGGTTTT AGAGACAATT 
19861 TTATTGGTCT CATGTATTAG AAGAGCAGAG GTAACATGGG TGTGCTTGGT GGTGAGGGAT 
19921 CGCAGTTGAA GGGTGTTGTA GATTTGGAAG ACAGAAAGAC AGAGGTGTCC TAGCAGCTTT 
19981 TGGTTGATTG AATTGGGGAG AGAAGAAGAT AGTTTTCAAT GTGGAATGAA GCTGTTGAGA 
20041 GCTATGATCG AGATGTGAGA ATTATTGAGA ACGATGGAAG TGAGGATGAG TTGGGAAATT 
20101 ATTGGTTTGC TCTTGGTGGA ATTGGGATTA GTGAGAGTTT TCAAGCTGTT AAAAGAAGTG
20161 CTGCTAACGG GGACCAAGGC AATACTACCT GGCAAAAAGA TTCAACATTT GCAGAACGCA
20221 ATGAAATAGG GGTGGGAAAT AACTTTGCCA TGGAAATTAA CCTGAATGCC AACCTATGGA 
20281 GAAATTTCCT TTACTCCAAT ATTGCGCTGT ACCTGCCAGA CAAGCTAAAA TACAACCCCA 
20341 CCAATGTGGA AATATCTGAC AACCCCAACA CCTACGACTA CATGAACAAG CGAGTGGTGG 
20401 CTCCTGGGCT TGTAGACTGC TACATTAACC TTGGGGCGCG CTGGTCTCTG GACTACATGG 
20461 ACAACGTTAA TCCCTTTAAC CACCACCGCA ATGCGGGCCT GCGTTACCGC TCCATGTTGT 
20521 TGGGAAACGG CCGCTACGTG CCCTTTCACA TTCAGGTGCC CCAAAAGTTT TTTGCCATTA 
20581 AAAACCTCCT CCTCCTGCCA GGCTCATACA CATATGAATG GAACTTCAGG AAGGATGTTA 
20641 ACATGGTTCT GCAGAGCTCT CTGGGAAACG ACCTTAGAGT TGACGGGGCT AGCATTAAGT 
20701 TTGACAGCAT TTGTCTTTAC GCCACCTTCT TCCCCATGGC CCACAACACG GCCTCCACGC
20761 TGGAAGCCAT GCTCAGAAAT GACACCAACG ACCAGTCCTT TAATGACTAC CTTTCCGCCG 
20821 CCAACATGCT ATATCCCATA CCCGCCAACG CCACCAACGT GCCCATCTCC ATCCCATCGC 
20881 GCAACTGGGC AGCATTTCGC GGTTGGGCCT TCACACGCTT GAAGACAAAG GAAACCCCTT 
20941 CCCTGGGATC AGGCTACGAC CCTTACTACA CCTACTCTGG CTCCATACCA TACCTTGACG 
21001 GAACCTTCTA TCTTAATCAC ACCTTTAAGA AGGTGGCCAT TACTTTTGAC TCTTCTGTTA 
21061 GCTGGCCGGG CAACGACCGC CTGCTTACTC CCAATGAGTT TGAGATTAAG CGCTCAGTTG
21121 ACGGGGAGGG CTATAACGTA GCTCAGTGCA ACATGACAAA GGACTGGTTC CTAGTGCAGA 
21181 TGTTGGCGAA CTACAATATT GGCTACCAGG GCTTCTACAT TCCAGAAAGC TACAAAGACC 
21241 GCATGTACTC GTTCTTCAGA AACTTCCAGC CCATGAGCCG GCAAGTGGTG GACGATACTA 
21301 AATACAAAGA TTATCAGCAG GTTGGAATTA TCCACCAGCA TAACAACTCA GGCTTCGTAG 
21361 GCTACCTCGC TCCCACCATG CGCGAGGGAC AAGCTTACCC CGCTAATGTT CCCTACCCAC 
21421 TAATAGGCAA AACCGCGGTT GATAGTATTA CCCAGAAAAA GTTTCTTTGC GACCGCACCC 
21481 TGTGGCGCAT CCCCTTCTCC AGTAACTTTA TGTCCATGGG TGCGCTCACA GACCTGGGCC 
21541 AAAACCTTCT CTACGCAAAC TCCGCCCACG CGCTAGACAT GACCTTTGAG GTGGATCCCA 
21601 TGGACGAGCC CACCCTTCTT TATGTTTTGT TTGAAGTCTT TGACGTGGTC CGTGTGCACC 
21661 AGCCGCACCG CGGCGTCATC GAGACCGTGT ACCTGCGCAC GCCCTTCTCG GCCGGCAACG 
21721 CCACAACATA AAGAAGCAAG CAACATCAAC AACAGCTGCC GCCATGGGCT CCAGTGAGCA 
21781 GGAACTGAAA GCCATTGTCA AAGATCTTGG TTGTGGGCCA TATTTTTTGG GCACCTATGA 
21841 CAAGCGCTTC CCAGGCTTTG TTTCCCCACA CAAGCTCGCC TGCGCCATAG TTAACACGGC 
21901 CGGTCGCGAG ACTGGGGGCG TACACTGGAT GGCCTTTGCC TGGAACCCGC GCTCAAAAAC 
21961 ATGCTACCTC TTTGAGCCCT TTGGCTTTTC TGACCAACGT CTCAAGCAGG TTTACCAGTT 
22021 TGAGTACGAG TCACTCCTGC GCCGTAGCGC CATTGCCTCT TCCCCCGACC GCTGTATAAC 
22081 GCTGGAAAAG TCCACCCAAA GCGTGCAGGG GCCCAACTCG GCCGCCTGTG GCCTATTCTG 
22141 CTGCATGTTT CTCCACGCCT TTGCCAACTG GCCCCAAACT CCCATGGATC ACAACCCCAC 
22201 CATGAACCTT ATTACCGGGG TACCCAACTC CATGCTTAAC AGTCCCCAGG TACAGCCCAC 
22261 CCTGCGCCGC AACCAGGAAC AGCTCTACAG CTTCCTGGAG CGCCACTCGC CCTACTTCCG 
22321 CAGCCACAGT GCGCAAATTA GGAGCGCCAC TTCTTTTTGT CACTTGAAAA ACATGTAAT^ 
22381 ATAATGTACT AGGAGACACT TTCAATAAAG GCAAATGTTT TTATTTGTAC ACTCTCGGGT 
22441 GATTATTTAC CCCCACCCTT GCCGTCTGCG CCGTTTAAAA ATCAAAGGGG TTCTGCCGCG 
22501 CATCGCTATG CGCCACTGGC AGGGACACGT TGCGATACTG GTGTTTAGTG CTCCACTTAA 
22561 ACTCAGGCAC AACCATCCGC GGCAGCTCGG TGAAGTTTTC ACTCCACAGG CTGCGCACCA 
22621 TCACCAACGC GTTTAGCAGG TCGGGCGCCG ATATCTTGAA GTCGCAGTTG GGGCCTCCGC 



FIG. 71 




22681 CCTGCGCGCG CGAGTTGCGA TACACAGGGT TACAGCACTG GAACACTATC AGCGCCGGGT
22741 GGTGCACGCT GGCCAGCACG CTCTTGTCGG AGATCAGATG CGGGTCCAGG TCCTCCGCGT 
22801 TGCTCAGGGC GAACGGAGTC AACTTTGGTA GCTGCCTTCC CAAAAAGGGT GCATGCCCAG 
22861 GCTTTGAGTT GCACTCGCAC CGTAGTGGCA TCAGAAGGTG ACCGTGCCCA GTCTGGGCGT 
22921 TAGGATACAG CGCCTGCATG AAAGCCTTGA TCTGCTTAAA AGCCACCTGA GCCTTTGCGC 
22981 CTTCAGAGAA GAACATGCCG CAAGACTTGC CGGAAAACTG ATTGGCCGGA CAGGCCGCGT
23041 CATGCACGCA GCACCTTGCG TCGGTGTTGG AGATCTGCAC CACATTTCGG CGCCACCGGT 
23101 TCTTCACGAT CTTGGCCTTG CTAGACTGCT CCTTCAGCGC GCGCTGCCCG TTTTCGCTCG 
23161 TCACATCCAT TTC7VATCACG TGCTCCTTAT TTATCATAAT GCTCCCGTGT AGACACTTAA 
23221 GCTCGCCTTC GATCTCAGCG CAGCGGTGCA GCCACAACGC GCAGCCCGTG GGCTCGTCIGT 
23281 GCTTGTAGGT TACCTCTGCA AACGACTGCA GGTACGCCTG CAGGAATCGC CCCATCATCG 
23341 TCACAAAGGT CTTGTTGCTG GTGAAGGTCA GCTGCAACCC GCGGTGCTCC TCGTTTAGCC 
23401 AGGTCTTGCA TACGGCCGCC AGAGCTTCCA CTTGGTCAGG CAGTAGCTTG AAGTTTGCCT 
23461 TTAGATCGTT ATCCACGTGG TACTTGTCCA TCAACGCGCG CGCAGCCTCC ATGCCCTTCT 
23521 CCCACGCAGA CACGATCGGC AGGCTCAGCG GGTTTATCAC CGTGCTTTCA CTTTCCGCTT 
23581 CACTGGACTC TTCCTTTTCC TCTTGCATCC GCATACCCCG CGCCACTGGG TCGTCTTCAT 
23641 TCAGCCGCCG CACCGTGCGC TfACCTCCCT TGCCGTGCTT GATTAGCACC GGTGGGTTGC 
23701 TGAAACCCAC CATTTGTAGC GCCACATCTT CTCTTTCTTC CTCGCTGTCC ACGATCACCT
23761 CTGGGGATGG CGGGCGCTCG GGCTTGGGAG AGGGGCGCTT CTTTTTCTTT TTGGACGCAA 
23821 TGGCCAAATC CGCCGTCGAG GTCGATGGCC GCGGGCTGGG TGTGCGCGGC ACCAGCGCAT 
23881 CTTGTGACGA GTCTTCTTCG TCCTCGGACT CGAGACGCCG CCTCAGCCGC TTTTTTGGGG 
23941 GCGCGCGGGG AGGCGGCGGC GACGGCGACG GGGACGAGAC GTCCTCCATG GTTGGTGGAC 
24001 GTCGCGCCGC ACCGCGTCCG CGCTCGGGGG TGGTTTCGCG CTGCTCCTCT TCCCGACTGG 
24061 CCATTTCCTT CTCCTATAGG CAGAAAAAGA TCATGGAGTC AGTCGAGAAG GAGGACAGCC 
24121 TAACCGCCCC CTTTGAGTTC GCCACCACCG CCTCCACCGA TGCCGCCAAC (SCGCCTACCA 
24181 CCTTCCCCGT CGAGGCACCC CCGCTTGAGG AGGAGGAAGT GATTATCGAG CAGGACCCAG 
24241 GTTTTGTAAG CGAAGACGAC GAAGATCGCT CAGTACCAAC AGAGGATAAA AAGCAAGACC 
24301 AGGACGACGC AGAGGCAAAC GAGGAACAAG TCGGGCGGGG GGACCAAAGG CATGGCGACT 
24361 ACCTAGATGT GGGAGACGAC GTGCTGTTGA AGCATCTGCA GCGCCAGTGC GCCATTATCT 
24421 GCGACGCGTT GCAAGAGCGC AGCGATGTGC CCCTCGCCAT AGCGGATGTC AGCCTTGCCT 
24481 ACGAACGCCA CCTGTTCTCA CCGCGCGTAC CCCCCAAACG CCAAGAAAAC GGCACATGICG 
24541 AGCCCAACCC GCGCCTCAAC TTCTACCCCG TATTTGCCGT GCCAGAGGTG CTTGCCACCT 
24601 ATCACATCTT TTTCCAAAAC TGCAAGATAC CCCTATCCTG CCGTGCCAAC CGCAGCCGAG 
24661 CGGACAAGCA GCTGGCCTTG CGGCAGGGCG CTGTCATACC TGATATCGCC TCGCTCGACG 
24721 AAGTGCCAAA AATCTTTGAG GGTCTTGGAC GCGACGAGAA GCGCGCGGCA AACGCTCTGC 
24781 AACAAGAAAA CAGCGAAAAT GAAAGTCACT GTGGAGTGCT GGTGGAACTT GAGGGTGACA 
24841 ACGCGCGCCT AGCCGTGCTG AAACGCAGCA TCGAGGTCAC CCACTTTGCC TACCCGGCAC 
24901 TTAACCTACC CCCCAAGGTT ATGAGCACAG TCATGAGCGA GCTGATCGTG CGCCGTGCAC 
24961 GACCCCTGGA GAGGGATGCA AACTTGCAAG AACAAACCGA GGAGGGCCTA CCCGCAGTTG 
25021 GCGATGAGCA GCTGGCGCGC TGGCTTGAGA CGCGCGAGCC TGCCGACTTG GAGGAGCGAC 
25081 GCAAGCTAAT GATGGCCGCA GTGCTTGTTA CCGTGGAGCT TGAGTGCATG CAGCGGTTCT 
25141 TTGCTGACCC GGAGATGCAG CGCAAGCTAG AGGAAACGTT GCACTACACC TTTCGCCAGG 
25201 GCTACGTGCG CCAGGCCTGC AAAATTTCCA ACGTGGAGCT CTGCAACCTG GTCTCCTACC
25261 TTGGAATTTT GCACGAAAAC CGCCTTGGGC AAAACGTGCT TCATTCCACG CTCAAGGGCG 
25321 AGGCGCGCCG CGACTACGTC CGCGACTGCG TTTACTTATT TCTGTGCTAC ACCTGGCAAA 
25381 CGGCCATGGG CGTGTGGCAG CAGTGCCTGG AGGAGCGCAA CCTCAAGGAG CTGCAGAAGC 
25441 TGCTAAAGCA AAACTTGAAG GACCTATGGA CGGCCTTCAA CGAGCGCTCC GTGGCCGCGC 
25501 ACCTGGCGGA CATTATCTTC CCCGAACGCC TGCTTAAAAC CCTGCAACAG GGTCTGCCAG 
25561 ACTTCACCAG TCAAAGCATG TTGCAAAACT TTAGGAACTT TATCCTAGAG CGTTCAGGAA 
25621 TTCTGCCCGC CACCTGCTGT GCGCTTCCTA GCGACTTTGT GCCCATTAAG TACCGTGAAT 
25681 GCCCTCCGCC GCTTTGGGGT CACTGCTACC TTCTGCAGCT AGCCAACTAC CTTGCCTACC 
25741 ACTCCGACAT CATGGAAGAC GTGAGCGGTG ACGGCCTACT GGAGTGTCAC TGTCGCTGCA 
25801 ACCTATGCAC CCCGCACCGC TCCCTGGTCT GCAATTCACA ACTGCTTAGC GAAAGTCAAA 
25861 TTATCGGTAC CTTTGAGCTG CAGGGTCCCT CGCCTGACGA AAAGTCCGCG GCTCCGGGGT 
25921 TGAAACTCAC TCCGGGGCTG TGGACGTCGG CTTACCTTCG CAAATTTGTA CCTGAGGACT 
25981 ACCACGCCCA CGAGATTAGG TTCTACGAAG ACCAATCCCG CCCGCCAAAT GCGGAGCTTA 
26041 CCGCCTGCGT CATTACCCAG GGCCACATCC TTGGCCAATT GCAAGCCATT AACAAAGCCC 
26101 GCCAAGAGTT TCTGCTACGA AAGGGACGGG GGGTTTACTT GGACCCCCAG TCCGGCGAGG 
26161 AGCTCAACCC AATCCCCCCG CCGCCGCAGC CCTATCAGCA GCCGCGGGCC CTTGCTTCCC 
26221 AGGATGGCAC CCAAAAAGAA GCTGCAGCTG CCGCCGCCGC CACCCACGGA CGAGGAGGAA 
26281 TACTGGGACA GTCAGGCAGA GGAGGTTTTG GACGAGGAGG AGGAGATGAT GGAAGACTGG 
26341 GACAGCCTAG ACGAGGAAGC TTCCGAGGCC GAAGAGGTGT CAGACGAAAC ACCGTCACCC 
26401 TCGGTCGCAT TCCCCTCGCC GGCGCCCCAG AAATCGGCAA CCGTTCCCAG CATTGCTACA 
26461 ACCTCCGCTC CTCAGGCGCC GCCGGCACTG CCCGTTCGCC GACCCAACCG TAGATGGGAC 
26521 ACCACTGGAA CCAGGGCCGG TAAGTCTAAG CAGCCGCCGC CGTTAGCCCA AGAGCAACAA 
26581 CAGCGCCAAG GCTACCGCTC GTGGCGCGTG CACAAGAACG CCATAGTTGC TTGCTTGCAA 
26641 GACTGTGGGG GCAACATCTC CTTCGCCCGC CGCTTTCTTC TCTACCATCA CGGCGTGGCC 
26701 TTCCCCCGTA ACATCCTGCA TTACTACCGT CATCTCTACA GCCCCTACTG CACCGGCGGC 
26761 AGCGGCAGCA ACAGCAGCGG CCACGCAGAA GCAAAGGCGA CCGGATAGCA AGACTCTGAC 
26821 AAAGCCCAAG AAATCCACAG CGGCGGCAGC AGCAGGAGGA GGAGCACTGC GTCTGGCGCC 
26881 CAACGAACCC GTATCGACCC GCGAGCTTAG AAACAGGATT TTTCCCACTC TGTATGCTAT 
26941 ATTTCAACAG AGCAGGGGCC AAGAACAAGA GCTGAAAATA AAAAACAGGT CTCTGCGCTC 
27001 CCTCACCCGC AGCTGCCTGT ATCACAAAAG CGAAGATCAG CTTCGGCGCA CGCTGGAAGA 
27061 CGCGGAGGCT CTCTTCAGCA AATACTGCGC GCTGACTCTT AAGGACTAGT TTCGCGCCCT 
27121 TTCTCAAATT TAAGCGCGAA AACTACGTCA TCTCCAGCGG CCACACCCGG CGCCAGCACC 
27181 TGTCGTCAGC GCCATTATGA GCAAGGAAAT TCCCACGCCC TACATGTGGA GTTACCAGCC 
27241 ACAAATGGGA CTTGCGGCTG GAGCTGCCCA AGACTACTCA ACCCGAATAA ACTACATGAG 
27301 CGCGGGACCC CACATGATAT CCCGGGTCAA CGGAATCCGC GCCCACCGAA ACCGAATTCT 
27361 CCTCGAACAG GCGGCTATTA CCACCACACC TCGTAATAAC CTTAATCCCC GTAGTTGGCC 
27421 CGCTGCCCTG GTGTACCAGG AAAGTCCCGC TCCCACCACT GTGGTACTTC CCAGAGACGC
27481 CCAGGCCGAA GTTCAGATGA CTAACTCAGG GGCGCAGCTT GCGGGCGGCT TTCGTCACAG 
27541 GGTGCGGTCG CCCGGGCAGG GTATAACTCA CCTGAAAATC AGAGGGCGAG GTATTCAGCT 
27601 CAACGACGAG TCGGTGAGCT CCTCTCTTGG TCTCCGTCCG GACGGGACAT TTCAGATCGG 
27661 CGGCGCTGGC CGCTCTTCAT TTACGCCCCG TCAGGCGATC CTAACTCTGC AGACCTCGTC
27721 CTCGGAGCCG CGCTCCGGAG GCATTGGAAC TCTACAATTT ATTGAGGAGT TCGTGCC'TrC
27781 GGTTTACTTC AACCCCTTTT CTGGACCTGC GGGCCACTAd CGGGACCAGT TTATTCCCAA
27841 CTTTGACGCG GTAAAAGACT CGGCGGACGG CTACGACTGA ATGACCAGTG GAGAGGCAGA
27901 GCAACTGCGC CTGACACACC TCGACCACTG GCGCCGCCAG AAGTGCTTTG CCCGCGGCTC
27961 CGGTGAGTTT TGTTACTTTG AATTGCCCGA AGAGCATATC GAGGGCCCGG CGCACGGCGT
28021 CCGGCTCACC ACCCAGGTAG AGCTTACACG TAGCCTGATT CGGGAGTTTA CCAAGCGCCC 
28081 CCTGCTAGTG GAGCGGGAGC GGGGTCCCTG TGTTCTGACC GTGGTTTGCA ACTGTCCTAA
28141 CCCTGGATTA CATCAAGATC TTTGTTGTCA TCTCTGTGCT GAGTATAATA AATACAGAAA
28201 TTAGAATCTA CTGGGGCTCC TGTCGCCATC CTGTGAACGC CACCGTTTTT ACCCACCCAA 
28261 AGCAGACCAA AGCAAACCTC ACCTCCGGTT TGCACAAGCG GGCCAATAAG TACCTTACCT 
28321 GGTACTTTAA CGGCTCTTCA TTTGTAATTT ACAACAGTTT CCAGCGAGAC GAAGTAAGTT 
28381 TGCCACACAA CCTTCTCGGC TTCAACTACA CCGTCAAGAA AAACACCACC ACCACCCTCC 
28441 TCACCTGCCG GGAACGTACG AGTGCGTCAC CGGTTGCTGC GCCCACACCT ACAGCCTGAG 
28501 CGTAACCAGA CATTACTCCC ATTTTCCCAA AACAGGAGGT GAGCTCAACT CCCGGAACTC 
28561 AGGTCAAAAA AGCATOTTGC GGGGTGCTGG GATTTTTTAA TTAAGTATAT GAGCAATTCA 
28621 AGTAACTCTA CAAGCTTGTC TAATTTTTCT GGAATTGGGG TCGGGGTTAT CCTTACTCTT
28681 GTAATTCTGT TTATTCTTAT ACTAGCACTT CTGTGCCTTA GGGTTGCCGC CTGCTGCACG
28741 CACGTTTGTA CCTATTGfCA GCTTTTTAAA CGCTGGGGGC GACATCCAAG ATGAGGTACA
28801 TGATTTTAGG CTTGCTCGCC CTTGCGGCAG TCTGCAGCGC TGCCAAAAAG GTTGAGTTTA
28861 AGGAACCAGC TTGCAATGTT ACATTTAAAT CAGAAGCTAA TGAATGCACT ACTCTTATAA 

28921 AATGCACCAC AGAACATGAA AAGCTTATTA TTCGCCACAA AGACA7U\ATT GGCAAGTATG
28981 CTGTATATGC TATTTGGCAG CCAGGTGACA CTAACGACTA TAATGTCACA GTCTTCCAAG 
29041 GTGAAAATCG TAAAACTTTT ATGTATAAAT TTCCATTTTA TGAAATGTGC GATATTACCA 
29101 TGTACATGAG CAAACAGTAC AAGTTGTGGC CCCCACAAAA GTGTTTAGAG AACACTGGCA
29161 CCTTTTGTTC CACCGCTCTG CTTATTACAG CGCTTGCTTT GGTATGTACC TTACTTTATC 
29221 TCAAATACAA AAGCAGACGC AGTTTTATTG ATGAAAAGAA AATGCCTTGA TTTTCCGCTT 
29281 GCTTGTATTC CCCTGGACAA TTTACTCTAT GTGGGATATG CGCCAGGCGG GAAAGATTAT 
29341 ACCCACAACC TTCAAATCAA ACTTTCCTGG ACGTTAGCGC CTGACTTCTG CCAGCGCCTG 
29401 CACTGCAAAT TTGATCAAAC CCAGCTTCAG CTTGCCTGCT CCAGAGATGA CCGGCTCAAC 
29461 CATCGCGCCC ACAACGGACT ATCGCAACAC CACTGCTACC GGACTAAAAT CTGCCCTAAA 
29521 TTTACCCCAA GTTCATGCCT TTGTCAATGA CTGGGCGAGC TTGGGCATGT GGTGGTTTTC 
29581 CATAGCGCTT ATGTTTGTTT GCCTTATTAT TATGTGGCTT ATTTGTTGCC TAAAGCGCAG 
29641 ACGCGCCAGA CCCCCCATCT ATAGGCCTAT CATTGTGCTC AACCCACACA ATGAAAAAAT
29701 TCATAGATTG GACGGTCTCA AACCATGTTC TCTTCTTTTA CAGTATGATT AAATGAGACA 
29761 TGATTCCTCG AGTCCTTATA TTATTGACCC TTGTTGCGCT TTTCTGTGCG TGCTCTACAT 
29821 TGGCTGCGGT CGCTCACATC GAAGTAGATT GCATCCCACC TTTCACAGTT TACCTGCTTT 
29881 ACGGATTTGT CACCCTTATC CTCATCTGCA GCCTCGTCAC TGTAGTCATC GCCTTCATTC 
29941 AGTTCATTGA CTGGATTTGT GTGCGCATTG CGTACCTTAG GCACCATCCG CAATACAGAG 
30001 ACAGGACTAT AGCTGATCTT CTCAGAATTC TTTAATTATG AAACGGATTG TCACTTTTGT 
30061 TTTGCTGATT TTCTGCGCCC TACCTGTGCT TTGCTCCCAA ACCTCAGCGC CTCCCAAAAG 
30121 ACATATTTCC TGCAGATTCA CTCAAATATG GAACATTCCC AGCTGCTACA ACAAACAGAG 
30181 CGATTTGTCA GAAGCCTGGT TATACGCCAT CATCTCTGTC ATCGTTTTTT GCAGTACCAT
30241 TTTTGCCCTA GCCATATACC CATACCTTGA CATTGGTTGG AATGCCATAG ATGCCATGAA
30301 CCACCCTACT TTCCCAGCGC CCAATGTCAT ACCACTGCAA CAGGTTATTG CCGCAATCAA 
30361 TCAGCCTCGC CCCCCTTCTC CCACCCCCAC TGAGATTAGC TACTTTAATT TGACAGGTGG 
30421 AGATGACTGA ATCTCTAGAT CTAGAATTGG ATGGAATTAA CACCGAACAG CGCCTACTAG 
30481 AAAGGCGCAA GGCGGCGTCC GAGCGAGAAC GCCTAAAACA AGAAGTTGAA GACATGGTTA 
30541 ACCTGCACCA GTGTAAAAGA GGTATCTTTT GTGTGGTCAA GCAGGCCAAA CTTACCTACG 
30601 AAAAAACCAC TACCGGCAAC CGCCTTAGCT ACAAGCTACC CACCCAGCGC CAAAAACTGG 
30661 TGCTTATGGT GGGAGAAAAA CCTATCACCG TCACCCAGCA CTCGGCAGAA ACAGAAGGCT 
30721 GCCTGCACTT CCCCTATCAG GGTCCAGAGG ACCTCTGCAC TCTTATTAAA ACCATGTGTG 
30781 GCATTAGAGA TCTTATTCCA TTCAACTAAC AATAAACACA CAATAAATTA CTTACTTAAA
30841 ATCAGTCAGC AAATCTTTGT CCAGCTTATT CAGCATCACC TCCTTTCCCT CCTCCCAACT 
30901 CTGGTATTTC AGCAGCCTTT TAGCTGCGAA CTTTCTCCAA AGTCTAAATG GGATGTCAAA
30961 TTCCTCATGT TCTTGTCCCT CCGCACCCAC TATCTTCATA TTGTTGCAGA TGAAACGCGC
31021 CAGACCGTCT GAAGACACCT TCAACCCTGT GTACCCATAT GACACGGAAA CCGGCCCTCC
31081 AACTGTGCCT TTCCTTACCC CTCCCTTTGT GTCGCCAAAT GGGTTCCAAG AAAGTCCCCC 
31141 CGGAGTGCTT TCTTTGCGTC TTTCAGAACC TTTGGTTACC TCACACGGCA TGCTTGCGCT 
31201 AAAAATGGGC AGCGGCCTGT CCCTGGATCA GGCAGGCAAC CTTACATCAA ATACAATCAC 
31261 TGTTTCTCAA CCGCTAAAAA AAACAAAGTC CAATATAACT TTGGA7UVCAT CCGCGCCCCT 
31321 TACAGTCAGC TCAGGCGCCC TAACCATGGC CACAACTTCG CCTTTGGTGG TCTCTGACAA 
31381 CACTCTTACC ATGCAATCAC AAGCACCGCT AACCGTGCAA GACTCAAAAC TTAGCATTGC 
31441 TACCAAAGAG CCACTTACAG TGTTAGATGG AAAACTGGCC CTGCAGACAT CAGCCCCCCT 
31501 CTCTGCCACT GATAACAACG CCCTCACTAT CACTGCCTCA CCTCCTCTTA CTACTGCAAA 
31561 TGGTAGTCTG GCTGTTACCA TGGAAAACCC ACTTTACAAC AACAATGGAA AACTTGGGCT 
31621 CAAAATTGGC GGTCCTTTGC AAGTGGCCAC CGACTCACAT GCACTAACAC TAGGTACTGG 
31681 TCAGGGGGTT GCAGTTCATA ACAATTTGCT ACATACAAAA GTTACAGGCG CAATAGGGTT 
31741 TGATACATCT GGCAACATGG AACTTAAAAC TGGAGATGGC CTCTATGTGG ATAGCGCCGG 
31801 TCCTAACCAA AAACTACATA TTAATCTAAA TACCACAAAA GGCCTTGCTT TTGACAACAC
31861 CGCAATAACA ATTAACGCTG GAAAAGGGTT GGAATTTGAA ACAGACTCCT CAAACGGAAA 
31921 TCCCATAAAA ACAAAAATTG GATCAGGCAT ACAATATAAT ACCAATGGAG CTATGGTTGC 
31981 AAAACTTGGA ACAGGCCTCA GTTTTGACAG CTCCGGAGCC ATAACAATGG GCAGCATAAA 
32041 CAATGACAGA CTTACTCTTT GGACAACACC AGACCCATCC CCAAATTGCA GAATTGCTTC 
32101 AGATAAAGAC TGCAAGCTAA CTCTGGCGCT AACAAAATGT GGCAGTCAAA TTTTGGGCAC 
32161 TGTTTCAGCT TTGGCAGTAT CAGGTAATAT GGCCTCCATC AATGGAACTC TAAGCAGTGT 
32221 AAACTTGGTT CTTAGATTTG ATGACAACGG AGTGCTTATG TCAAATTCAT CACTGGACAA 
32281 ACAGTATTGG AACTTTAGAA ACGGGGACTC CACTAACGGT CAACCATACA CTTATGCTGT 
32341 TGGGTTTATG CCAAACCTAA AAGCTTACCC AAAAACTCAA AGTAAAACTG CAAAAAGTAA 
32401 TATTGTTAGC CAGGTGTATC TTAATGGTGA CAAGTCTAAA CCATTGCATT TTACTATTAC 
32461 GCTAAATGGA ACAGATGAAA CCAACCAAGT AAGCAAATAC TCAATATCAT TCAGTTGGTC
32521 CTGGAACAGT GGACAATACA CTAATGACAA ATTTGCCACC AATTCCTATA CCTTCTCCTA 
32581 CATTGCCCAG GAATAAAGAA TCGTGAACCT GTTGCATGTT ATGTTTCAAC GTGTTTATTT 
32641 TTCAATTGCA GAAAATTTCA AGTCATTTTT CATTCAGTAG TATAGCCCCA CCACCACATA 
32701 GCTTATACTA ATCACCGTAC CTTAATCAAA CTCACAGAAC CCTAGTATTC AACCTGCCAC 
32761 CTCCCTCCCA ACACACASAG TACACAGTCC TTTCTCCCCG GCTGGCCTTA AACAGCATGA
32821 TATCATGGGT AAGAGACATA TTCTTAGGTG TTATATTGCA GACGGTCTCC TGTCGAGCCA
32881 AACGCTCATC AGTGATGTTA ATAAACTCCC CGGGCAGCTC GCTTAAGTTC ATGTCGCTGT
32941 CCAGCTGCTG AGCCACAGGG TGCTGTCCAA CTTGCGGTTG CTCAACGGGC GGCGAAGGAiS 
33001 AAGTCCACGC CTACATGGGG GTAGAGTCAT AATCGTGCAT CAGGATAGGG CGGTGGTGCT
33061 GCAGCAGCGC GCGAATAAAC TGCTGCCGCC GCCGCTCCGT CGTGCAGGAA TACAACATGG
33121 CAGTGGTCTC CTCAGCGATG ATTCGCACCG CCCGCAGCAT AAGGCGCCTT GTCCTCCGGG 
33181 CACAGCAGCG CACCCTGATC TCAGTTAAGT CAGCACAGTA ACTGCAGCAC AGTACCACAA
33241 TATTGTTTAA AATCCCACAG TGCAAGGCGC TGTATCCAAA GCTCATGGCG GGGACCACAG 
33301 AACCCACGTG GCCATCATAC CACAAGCGCA GGTAGATTAA GTGGCGACCC CTCATAAACA
33361 CGCTGGACAT AAACATTACC TCTTTTGGCA TGTTGTAATT CACCACCTCC CGGTACCATA 
33421 TAAACCTCTG ATTAAACATG GCGCCATCCA CCACCATCCT AAACCAGCTG GCCAAAACCT 
33481 GCCCGCCGGC TATGCACTGC AGGGAACCGG GACTGGAACA ATGACAGTGG AGAGCCCAGG 
33541 ACTCGTAACC ATGGATCATC ATGCTCGTCA TGATATCAAT GTTGGCACAA CACAGGCACA
33601 CGTGCATACA CTTCCTCAGG ATTACAAGCT CCTCCCGCGT CAGAACCATA TCCCAGGGAA 
33661 CAACCCATTC CTGAATCAGC GTAAATCCCA CACTGCAGGG AAGACCTCGC ACGTAACTCA 
33721 CGTTGTGCAT TGTCAAAGTG TTACATTCGG GCAGCAGCGG ATGATCCTCC AGTATGGTAG 
33781 CGCGTGTCTC TGTCTCAAAA GGAGGTAGGC GATCCCTACT GTACGGAGTG CGCCGAGACA 
33841 ACCGAGATCG TGTTGGTCGT AGTGTCATGC CAAATGGAAC GCCGGACGTA GTCATATTTC 
33901 CTGAAGCAAA ACCAGGTGCG GGCGTGACAA ACAGATCTGC GTCTCCGGTC TCGTCGCTTA 
33961 GCTCGCTCTG TGTAGTAGTT GTAGTATATC CACTCTCTCA AAGCATCCAG GCGCCCCCTG
34021 GCTTCGGGTT CTATGTAAAC TCCTTCATGC GCCGCTGCCC TGATAACATC CACCACCGCA 
34081 GAATAAGCCA CACCCAGCCA ACCTACACAT TCGTTCTGCG AGTCACACAC GGGAGGAGCG 
34141 GGAAGAGCTG GAAGAACCAT GTTTTTTTTT TTTATTCCAA AAGATTATCC AAAACCTCAA 
34201 AATGAAGATC TATTAAGTGA ACGCGCTCCC CTCCGGTGGC GTGGTCAAAC TCTACAGCCA 
34261 AAGAACAGAT AATGGCATTT GTAAGATGTT GCACAATGGC TTCCAAAAGG CAAACTGCCC 
34321 TCACGTCCAA GTGGACGTAA AGGCTAAACC CTTCAGGGTG AATCTCCTCT ATAAACATTC 
34381 CAGCACCTTC AACCATGCCC AAATAATTTT CATCTCGCCA CCTTATCAAT ATGTCTCTAA 
34441 GCAAATCCCG AATATTAAGT CCGGCCATTG TAAAAATCTG CTCCAGAGCG CCCTCCACCT 
34501 TCAGCCTCAA GCAGCGAATC ATGATTGCAA AAATTCAGGT TCCTCACAGA CCTGTATAAG 
34561 ATTCAAAAGC GGAACATTAA CAAAAATACC GCGATCCCGT AGGTCCCTTC GCAGGGCCAG 
34621 CTGAACATAA TCGTGCAGGT CTGCACGGAC CAGCGCGGCC ACTTCCCCGC CAGGAACCAT 
34681 GACAAAAGAA CCCACACTGA TTATGACACG CATACTCGGA GCTATGCTAA CCAGCGTAGC 
34741 CCCGATGTAA GCTTGTTGCA TGGGCGGCGA TATAAAATGC AAGGTACTGC TCAAAAAATC 
34801 AGGCAAAGCC TCGCGCAAAA AAGCAAGCAC ATCGTAGTCA TGCTCATGCA GATAAAGGCA 
34861 GGTAAGTTCC GGAACCACCA CAGAAATAGA CACCATTTTT CTCTCAAACA TGTCTGCGGG 
34921 TTCCTGCATA AACACAAAAT AAAATAACAA AAAAAAAAAA ACATTTAAAC ATTAGAAGCC 
34981 TGTNTTACAA CAGGAAAAAC AACCCTTATA AGCATAAGAC GGACTACGGC CATGCCGGCG 
35041 TGACCGTAAA AAAACTGGTC ACCGTGATTA AAAAGCACCA CCGACAGTTC CTCGGTCATG 
35101 TCCGGAGTCA TAATGTAAGA CTCGGTAAAC ACATCAGGTT GGTTAACATC GGTCAGTGCT
35161 AAAAAGCGAC CGAAATAGCC CGGGGGAATA CATACCCGCA GGCGTAGAGA CAACATTACA 
35221 GCCCCCATAG GAGGTATAAC AAAATTAATA GGAGAGAAAA ACACATAAAC ACCTGAAAAA 
35281 CCCTCCTGCC TAGGCAAAAT AGCACCCTCC CGCTCCAGAA CAACATACAG CGCTTCCACA
35341 GCGGCAGCCA TAACAGTCAG CCTTACCAGT AAAAAAACCT ATTAAAAAAC ACCACTCGAC 
35401 ACGGCACCAG CTCAATCAGT CACAGTGTAA AAAGGGCCAA GTACAGAGCG AGTATATATA 
35461 GGACTAAAAA ATGACGTAAC GGTTAAAGTC CACAAAAACC ACCCAGAAAA CCGCACGCGA 
35521 ACCTACGCCC ASAAACGAAA GCCAAAAAAC CCACAACTTC CTCAAATCTT CACTTCCGTT 
35581 TTCCCACGAT ACGTCACTTC CCATTTTAAA AAAAAACTAC AATTCCCAAT ACATGCAAGT
35641 TACTCCGCCC TAAAACCTAC GTCACCCGCC CCGTTCCCAC GCCCCGCGCC ACGTCACAAA 
35701 CTCCACCCCC TCATTATCAT ATTGGCTTCA ATCCAAAATA AGGTATATTA TTGATGATG 
1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT
121 GATGTTGCAA GTGTGGfCGGA ACACATGTAA GCGACGGATG TGGCAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACAG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG
241 TAAATTTGGG CGTAACCGAG TAAGATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA
301 AGTGAAATCT GAATAATTTT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG 
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG TGTAGTGTAT TTATACCCGG
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GACTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAC TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA 
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA
901 CCTTGTACCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAATTATGGG CAGTGGGTGA 
1141 TAGAGTGGTG GGTTTGGTGT GGTAATTTTT TTTTTAATTT TTACAGTTTT GTGGTTTAAA 
1201 GAATTTTGTA TTGTGATTTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGC CGTCCTAAAA TGGCGCCTGC TATCCTGAGA 
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 
1501 CCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 
1801 TTTCTGTGGG GCTCATCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT TGTGAGACAC 
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCGA TAATACCGAC GGAGGAGCAG 
2161 CAGCAGCAGC AGGAGGAAGC CAGGCGGCGG CGGCAGGAGC AGAGCCCATG GAACCCGAGA 
2221 GCCGGCCTGG ACCCTCGGGA ATGAATGTTG TACAGGTGGC TGAACTGTAT CGAGAACTGA 
2281 GACGCATTTT GACAATTACA GAGGATGGGC AGGGGCTAAA GGGGGTAAAG AGGGAGCGGG 
2341 GGGCTTGTGA GGCTACAGAG GAGGCTAGGA ATCTAGCTTT TAGCTTAATG ACCAGACACC 
2401 GTCCTGAGTG TATTACTTTT CAACAGATCA AGGATAATTG CGCTAATGAG CTTGATCTGC
2461 TGGCGCAGAA GTATTCCATA GAGCAGCTGA CCACTTACTG GCTGCAGCCA GGGGATGATT 
2521 TTGAGGAGGC TATTAGGGTA TATGCAAAGG TGGCACTTAG GCCAGATTGC AAGTACAAGA 
2581 TCAGCAAACT TGTAAATATC AGGAATTGTT GCTACATTTC TGGGAACGGG GCCGAGGTGG 
2641 AGATAGATAC GGAGGATAGG GTGGCCTTTA GATGTAGCAT GATAAATATG TGGCCGGGGG
2701 TGCTTGGCAT GGACGGGGTG GTTATTATGA ATGTAAGGTT TACTGGCCCC AATTTTAGCG 
2761 GTACGGTTTT CCTGGCCAAT ACCAACCTTA TCCTACACGG TGTAAGCTTC TATGGGTTTA 
2821 ACAATACCTG TGTGGAAGCC TGGACCGATG TAAGGGTTCG GGGCTGTGCC TTTTACTGCT 
2881 GCTGGAAGGG GGTGGTGTGT CGCCCCAAAA GCAGGGCTTC AATTAAGAAA TGCCTCTTTG 
2941 AAAGGTGTAC CTTGGGTATC CTGTCTGAGG GTAACTCCAG GGTGCGCCAC AATGTGGCCT 
3001 CCGACTGTGG TTGCTTCATG CTAGTGAAAA GCGTGGCTGT GATTAAGCAT AACATGGTAT 
3061 GTGGCAACTG CGAGGACAGG GCCTCTCAGA TGCTGACCTG CTCGGACGGC AACTGTCACC 
3121 TGCTGAAGAC CATTCACGTA GCCAGCCACT CTCGCAAGGC CTGGCCAGTG TTTGAGCATA 
3181 ACATACTGAC CCGCTGTTCC TTGCATTTGG GTAACAGGAG GGGGGTGTTC CTACCTTACC 
3241 AATGCAATTT GAGTCACACT AAGATATTGC TTGAGCCCGA GAGCATGTCC AAGGTGAACC
I3^n^^l GTTTGACATG ACCATGAAGA TCTGGAAGGT GCTGAGGTAC GATGAGACCC
nil ^rl^^lt CAGACCCTGC GAGTGTGGCG GTAAACATAT TAGGAACCAG OCTgSS 
Itl^ CGAGGAGCTG AGGCCCGATC ACTTGGTGCT GGCCTGCACC CGcSact 

Itl] CGATGAAGAT ACAGATTGAG GTACTGAAAT GTGTGGGCGT GGcSISS 

lln^ ^^n^ TATATAAGGT GGGGGTCTTA TGTAGTTTTG TATCTGTTTT GCAGC^?2 
lll^ r3^n?.?^l QAOCACCAAC TCGTTTGATG GAAGCATTGT GAGCTCATAT SSacJSg? 

3661 gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt SStptto?? 
IIV: f^fJSn'^'^^ cgcaaactct actaccttga cctacgaS? cgSSSSS SS?SSS 

3781 agactgcagc CTCCGCCGCC GCTTCAGCCG CTGCAGCCAC CGCCCGCGGG AT^SrS^ 

3841 actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttStc? S^SgcJaJ? 

i^nrj!''' '^'^^CrrTTG GCACAATTCG ATTCT??aA? cSSScS SSgtcSJS 
^^ff^^^"^ GTTGGATCTG CGCCAGCAGG TTTCTGCCCT GAAGGCTtS TCCCcSS 
Tol] ^JSSS^'' AAACATAAAT AAAAAACCAG ACTCTGTTTG GATTtSS5? SS^SJ 

J?!^ ttatttaggg gttttgcgcg cgcggtaggc ccgggaccS cgctct?St 

4141 cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgStcSg atSSSSJ 

till J^^!^''^ AAGCCCGTCT CTGGGGTG6A GGTAGCACCA CTGCaS? JJS^JSg 
tltl ^^^^^-^ATC CAGTCGTAGC AGGAGCGCT6 GGCGTGGTGC SSSStcJ 

SII^^^ CAAGCTGATT GCCAGGGGCA GGCCCTTGGT GTAAGTGtS ACA^SJ 
^^^^^ TGGGTGCATA CGTGGGGATA TGAGATGCAT CTTGGACTGT ATTTTtSSJ 
ttn\ ^^"^ CCCAGCCATA TCCCTCCGGG GATTCATGTT GTgSgS?? JJSSJSg 
4501 TGTATCCGGT GCACTTGGGA AATTTGTCAT GTAGCTTAGA AGGAA^TG^G ^SSJaJS 
lit] Sn^^^^^^ CTTGTGACCT CCAAGATl^T CCATOCAT^? GT^ISS S^S^S 
ttl^ GGCGGCCTGG GCGAAGATAT TTCTGGGATC ACTAACGTCA TAgSctJ?? 

t^l. ^.^^^''^'^'''^ ATCGTCATAG GCCATTTTTA CAAAGCGCGG GCGGAGGGTG CCAgSSg 
lit. n^^"^"^ TCCATCCGGC CCAGGGGCGT AGTTACCCTC ACAGATTTCC StoSSg 

IIV; S^^I^ agatgggggg atcatgtcta cctgcggggc gatSaSS 

tit, gatcagctgg gaagaaagca ggttcctgag cagctgSaJ J5S?SSc 

4921 CGGTGGGCCC GTAAATCACA CCTATTACCG GGTGCAACTG GTAGTTAA6A GAGCTOtJS 
till CCTGAGCAGG GGGGCCACTT CGTTAAGCaS GTCcSS^S ^SSJSJS 

5041 CCCTGACCAA ATCCGCCAGA AGGCGCTCGC CGCCCAGCGA TAGCAGTTCT TGCAAG^Ir 
llli ^^^^T*^ CAACGGTTTG AGACCGTCCG CCGTAGGCAT GCTTtJSS SSJSSt? 

5161 gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga ?cSgJa5JJ 

till I'^^T' C^«GG1^ GGCGGCTTTC GCTGTACGGC SaotSSJ S^SgtcJaG 
nil nn^?.^"" GTCATGTCTT TCCACGGGCG CAGGGTCCTC GTCAGCgSg TCTgStc^? 
tlnl TGCGCTCCGG GCTGCGCGCT GGCCAGGGTG CGCTTGAGGC TGctSJSS 

nil ^^r^^"^"^ CGCTGCCGGT CTTCGCCCTG CGCGTCGGCC AGGTAGCATT S^aSSJ 
litl nl?.^'^'"^'''''' AGCCCCTCCG CGGCGTGGCC CTTGGCGCGC AGCTTGcJcJ SSaGgSSJ 

nil SS^^^o''''^ gggcagtgca gacttttgag ggcgtagagc t?gggcS gSa5?ga 

nil J^f^S^^ TAGGCATCCG CGCCGCAGGC CCCGCAGACG GTCTCGCaS C^gJSS 
Itil GGCCGTTCGG GGTCAAAAAC CAGGTTTCCC CCATGCTtS TCATgJgSJ 

5701 CTTACCTCTG GTTTCCATGA GCCGGTGTCC ACGCTCGGTG ACGAAAAGGC TCTCCgS 
5761 CCCGTATACA GACTTGAGAG GCCTGTCCTC GAGCGGTGTT CCGCGgJ^CT StcSS 
nil ^rn^r.^t'^ CACTCTGAGA CAAAGGCTCG CGTCCaS SSSSSS S^Sg 
5881 GGAGGGGTAG CGGTCGTTGT CCACTAGGGG GTCCACTCGC TCCAGGGTCT GAAGAcirA? 
mi TCGGCATCAA GGAAGGTGAT TGGTTTGTAG GTOTA^^I ^^AcJSJJ 

tnll r^^"^""^ GGGGGGCTAT AAAAGGGGGT GGGGGCGCGT TCGTCCTCA? J^tcJ^S^ 
6061 ATCGCTGTCT GCGAGGGCCA GCTGTTGGGG TGAGTACTCC CTCTGaSaG JcSS^ar 
till S^f^S^"'^ AGATTGTCAG TTTCCAAAAA CGAGGaS SgS^A JJSSSS^ 
till ff^""^^^ TTGAGGGTGG CCGCATCCAT CTGGTCAGAA AAGACAATCT SSSStc 
tlnl GCAAACGACC CGTAGAGGGC GTTGGACAGC AACTTGgSI SSSSS 

till S^^"^ ttgtcgcgat cggcgcgctc cttggccgcg atgtt?ag?t g?SctSS? 

till ^^^^'^'^^^ CACCGCCATT CGGGAAAGAC GGTGGTGCGC TCGTCGGgS cSgctS 

till S^'^'^''^ cggttgtgca gggtgacaag gtcaacgctg gtgScta?S cSSSag 
6481 gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gJSSStc 

6541 TAGCTGCGTC TCGTCCGGGG GGTCTCCGTC CACGGTAAaS kct^^l SSS^S 
6601 GTCGAAGTAG TCTATCTTGC ATCCTTGCAA GTCTAGCGCC TGCTGCCATG CGCGGGCGGC
6661 AAGCGCGCGC TCGTATGGGT TGAGTGGGGG ACCCCATGGC ATGGGGTGGG TGAGCGCGGA 
6721 GGCGTACATG CCGCAAATGT CGTAAACGTA GAGGGGCTCT CTGAGTATTC CAAGATATGT 
6781 AGGGTAGCAT CTTCCACCGC GGATGCTGGC GCGCACGTAA TCGTATAGTT CGTGCGAGGG 
6841 AGCGAGGAGG TCGGGACCGA GGTTGCTACG GGCGGGCTGC TCTGCTCGGA AGACTATCTG
6901 CCTGAAGATG GCATGTGAGT TGGATGATAT GGTTGGACGC TGGAAGACGT TGAAGCTGGC 
6961 GTCTGTGAGA CCTACCGCGT CACGCACGAA GGAGGCGTAG GAGTCGCGCA GCTTGTTGAC 
7021 CAGCTCGGCG GTGACCTGCA CGTCTAGGGC GCAGTAGTCC AGGGTTTCCT TGATGATGTC 
7081 ATACTTATCC TGTCCCTTTT TTTTCCACAG CTCGCGGTTG AGGACAAACT CTTCGCGGTC 
7141 TTTCCAGTAC TCTTGGATCG GAAACCCGTC GGCCTCCGAA CGGTAAGAGC CTAGCATGTA 
7201 GAACTGGTTG ACGGCCTGGT AGGCGCAGCA TCCCTTTTCT ACGGGTAGCG CGTATGCCTG
7261 CGCGGCCTTC CGGAGCGAGG TGTGGGTGAG CGCAAAGGTG TCCCTGACCA TGACTTTGAG
7321 GTACTGGTAT TTGAAGTCAG TGTCGTCGCA TCCGCCCTGC TCCCAGAGCA AAAAGTCCGT 
7381 GCGCTTTTTG GAACGCGGAT TTGGCAGGGC GAAGGTGACA TCGTTGAAGA GTATCTTTCC 
7441 CGCGCGAGGC ATAAAGTTGC GTGTGATGCG GAAGGGTCCC GGCACCTCGG AACGGTTGTT 
7501 AATTACCTGG GCGGCGAGCA CGATCTCGTC AAAGCCGTTG ATGTTGTGGC CCACAATGTA 
7561 AAGTTCCAAG AAGCGCGGGA TGCCCTTGAT GGAAGGCAAT TTTTTAAGTT CCTCGTAGGT 
7621 GAGCTCTTCA GGGGAGCTGA GCCCGTGCTC TGAAAGGGCC CAGTCTGCAA GATGAGGGTT 
7681 GGAAGCGACG AATGAGCTCC ACAGGTCACG GGCCATTAGC ATTTGCAGGT GGTCGCGAAA 
7741 GGTCCTAAAC TGGCGACCTA TGGCCATTTT TTCTGGGGTG ATGCAGTAGA AGGTAAGCGG
7801 GTCTTGTTCC CAGCGGTCCC ATCCAAGGTT CGCGGCTAGG TCTCGCGCGG CAGTCACTAG 
7861 AGGCTCATCT CCGCCGAACT TCATGACCAG CATGAAGGGC ACGAGCTGCT TCCCAAAGGC 
7921 CCCCATCCAA GTATAGGTCT CTACATCGTA GGTGACAAAG AGACGCTCGG TGCGAGGATG 
7981 CGAGCCGATC GGGAAGAACT GGATCTCCCG CCACCAATTG GAGGAGTGGC TATTGATGTG 
8041 GTGAAAGTAG AAGTCCCTGC GACGGGCCGA ACACTCGTGC TGGCTTTTGT AAAAACGTGC 
8101 GCAGTACTGG CAGCGGTGCA CGGGCTGTAC ATCCTGCACG AGGTTGACCT GACGACCGCG 
8161 CACAAGGAAG CAGAGTGGGA ATTTGAGCCC CTCGCCTGGC GGGTTTGGCT GGTGGTCTTC 
8221 TACTTCGGCT GCTTGTCCTT GACCGTCTGG CTGCTCGAGG GGAGTTACGG TGGATCGGAC 
8281 CACCACGCCG CGCGAGCCCA AAGTCCAGAT GTCCGCGCGC GGCGGTCGGA GCTTGATGAC 
8341 AACATCGCGC AGATGGGAGC TGTCCATGGT CTGGAGCTCC CGCGGCGTCA GGTCAGGCGG
8401 GAGCTCCTGC AGGTTTACCT CGCATAGACG GGTCAGGGCG CGGGCTAGAT CCAGGTGATA 
8461 CCTAATTTCC AGGGGCTGGT TGGTGGCGGC GTCGATGGCT TGCAAGAGGC CGCATCCCCG 
8521 CGGCGCGACT ACGGTACCGC GCGGCGGGCG GTGGGCCGCG GGGGTGTCCT TGGATGATGC
8581 ATCTAAAAGC GGTGACGCGG GCGAGCCCCC GGAGGTAGGG GGGGCTCCGG ACCCGCCGGG 
8641 AGAGGGGGCA GGGGCACGTC GGCGCCGCGC GCGGGCAGGA GCTGGTGCTG CGCGCGTAGG 
8701 TTGCTGGCGA ACGCGACGAC GCGGCGGTTG ATCTCCTGAA TCTGGCGCCT CTGCGTGAAG 
8761 ACGACGGGCC CGGTGAGCTT GAGCCTGAAA GAGAGTTCGA CAGAATCAAT TTCGGTGTCG 
8821 TTGACGGCGG CCTGGCGCAA AATCTCCTGC ACGTCTCCTG AGTTGTCTTG ATAGGCGATC 
8881 TCGGCCATGA ACTGCTCGAT CTCTTCCTCC TGGAGATCTC CGCGTCCGGC TCGCTCCAGG 
8941 GTGGCGGCGA GGTCGTTGGA AATGCGGGCC ATGAGCTGCG AGAAGGCGTT GAGGCCTCCC 
9001 TCGTTCCAGA CGCGGCTGTA GACCACGCCC CCTTCGGCAT CGCGGGCGCG CATGACCACC 
9061 TGCGCGAGAT TGAGCTCCAC GTGCCGGGCG AAGACGGCGT AGTTTCGCAG GCGCTGAAAG 
9121 AGGTAGTTGA GGGTGGTGGC GGTGTGTTCT GCCACGAAGA AGTACATAAC CCAGCGTCGC 
9181 AACGTGGATT CGTTGATATC CCCCAAGGCC TCAAGGCGCT CCATGGCCTC GTAGAAGTCC 
9241 ACGGCGAAGT TGAAAAACTG GGAGTTGCGC GCCGACACGG TTAACTCCTC CTCCAGAAGA 
9301 CGGATGAGCT CGGCGACAGT GTCGCGCACC TCGCGCTCAA AGGCTACAGG GGCCTCTTCT 
9361 TCTTCTTCAA TCTCCTCTTC CATAAGGGCC TCCCCTTCTT CTTCTTCTGG CGGCGGTGGG
9421 GGAGGGGGGA CACGGCGGCG ACGACGGCGC ACCGGGAGGC GGTCGACAAA GCGCTCGATC 
9481 ATCTCCCCGC GGCGACGGCG CATGGTCTCG GTGACGGCGC GGCCGTTCTC GCGGGGGCGC
9541 AGTTGGAAGA CGCCGCCCGT CATGTCCCGG TTATGGGTTG GCGGGGGGCT GCCATGCGGC 
9601 AGGGATACGG CGCTAACGAT GCATCTCAAC AATTGTTGTG TAGGTACTCC GCCGCCGAGG 
9661 GACCTGAGCG AGTCCGCATC GACCGGATCG GAAAACCTCT CGAGAAAGGC GTCTAACCAG 
9721 TCACAGTCGC AAGGTAGGCT GAGCACCGTG GCGGGCGGCA GCGGGCGGCG GTCGGGGTTG 
9781 TTTCTGGCGG AGGTGCTGCT GATGATGTAA TTAAAGTAGG CGGTCTTGAG ACGGCGGATG 
9841 GTCGACAGAA GCACCATGTC CTTGGGTCCG GCCTGCTGAA TGCGCAGGCG GTCGGCCATG 
9901 CCCCAGGCTT CGTTTTGACA TCGGCGCAGG TCTTTGTAGT AGTCTTGCAT GAGCCTTTCT
9961 ACCGGCACTT CTTCTTCTCC TTCCTCTTGT CCTGCATCTC TTGCATCTAT CGCTGCGGCG
10021 GCGGCGGAGT TTGGCCGTAG GTGGCGCCCT CTTCCTCCCA TGCGTGTGAC CCCGAAGCCC
10081 CTCATCGGCT GAAGCAGGGC TAGGTCGGCG ACAACGCGCT CGGCTAATAT GGCCTGCTGC
10141 ACCTGCGTGA GGGTAGACTG GAAGTCATCC ATGTCCACAA AGCGGTGGTA TGCGCCCGTG
10201 TTGATGGTGT AAGTGCAGTT GGCCATAACG GACCAGTTAA CGGTCTGGTG ACCCGGCTGC
10261 GAGAGCTCGG TGTACCTGAG ACGCGAGTAA GCCCTCGAGT CAAATACGTA GTCGTTGCAA
10321 GTCCGCACCA GGTACTGGTA TCCCACCAAA AAGTGCGGCG GCGGCTGGCG GTAGAGGGGC
10381 CAGCGTAGGG TGGCCGGGGC TCCGGGGGCG AGATCTTCCA ACATAAGGCG ATGATATCCG
10441 TAGATGTACC TGGACATCCA GGTGATGCCG GCGGCGGTGG TGGAGGCGCG CGGAAAGTCG
10501 CGGACGCGGT TCCAGATGTT GCGCAGCGGC AAAAAGTGCT CCATGGTCGG GACGCTCTGG
10561 CCGGTCAGGC GCGCGCAATC GTTGACGCTC TAGACCGTGC AAAAGGAGAG CCTGTAAGCG
10621 GGCACTCTTC CGTGGTCTGG TGGATAAATT CGCAAGGGTA TCATGGCGGA CGACCGGGGT
10681 TCGAGCCCCG TATCCGGCCG TCCGCCGTGA TCCATGCGGT TACCGCCCGC GTGTCGAACC
10741 CAGGTGTGCG ACGTCAGACA ACGGGGGAGT GCTCCTTTTG GCTTCCTTCC AGGCGCGGCG
10801 GCTGCTGCGC TAGCTTTTTT GGCCACTGGC CGCGCGCAGC GTAAGCGGTT AGGCTGGAAA
10861 GCGAAAGCAT TAAGTGGCTC GCTCCCTGTA GCCGGAGGGT TATTTTCCAA GGGTTGAGTC
10921 GCGGGACCCC CGGTTCGAGT CTCGGACCGG CCGGACTGCG GCGAACGGGG GTTTGCCTCC
10981 CCGTCATGCA AGACCCCGCT TGCAAATTCC TCCGGAAACA GGGACGAGCC CCTTTTTTGC
11041 TTTTCCCAGA TGCATCCGGT GCTGCGGCAG ATGCGCCCCC CTCCTCAGCA GCGGCAAGAG
11101 CAAGAGCAGC GGCAGACATG CAGGGCACCC TCCCCTCCTC CTACCGCGTC AGGAGGGGCG
11161 ACATCCGCGG TTGACGCGGC AGCAGATGGT GATTACGAAC CCCCGCGGCG CCGGGCCCGG
11221 CACTACCTGG ACTTGGAGGA GGGCGAGGGC CTGGCGCGGC TAGGAGCGCC CTCTCCTGAG
11281 CGGTACCCAA GGGTGCAGCT GAAGCGTGAT ACGCGTGAGG CGTACGTGCC GCGGCAGAAC
11341 CTGTTTCGCG ACCGCGAGGG AGAGGAGCCC GAGGAGATGC GGGATCGAAA GTTCCACGCA
11401 GGGCGCGAGC TGCGGCATGG CCTGAATCGC GAGCGGTTGC TGCGCGAGGA GGACTTTGAG
11461 CCCGACGCGC GAACCGGGAT TAGTCCCGCG CGCGCACACG TGGCGGCCGC CGACCTGGTA
11521 ACCGCATACG AGCAGACGGT GAACCAGGAG ATTAACTTTC AAAAAAGCTT TAACAACCAC
11581 GTGCGTACGC TTGTGGCGCG CGAGGAGGTG GCTATAGGAC TGATGCATCT GTGGGACTTT
11641 GTAAGCGCGC TGGAGCAAAA CCCAAATAGC AAGCCGCTCA TGGCGCAGCT GTTCCTTATA
11701 GTGCAGCACA GCAGGGACAA CGAGGCATTC AGGGATGCGC TGCTAAACAT AGTAGAGCCC
11761 GAGGGCCGCT GGCTGCTCGA TTTGATAAAC ATCCTGCAGA GCATAGTGGT GCAGGAGCGC
11821 AGCTTGAGCC TGGCTGACAA GGTGGCCGCC ATCAACTATT CCATGCTTAG CCTGGGCAAG
11881 TTTTACGCCC GCAAGATATA CCATACCCCT TACGTTCCCA TAGACAAGGA GGTAAAGATC
11941 GAGGGGTTCT ACATGCGCAT GGCGCTGAAG GTGCTTACCT TGAGCGACGA CCTGGGCGTT
12001 TATCGCAACG AGCGCATCCA CAAGGCCGTG AGCGTGAGCC GGCGGCGCGA GCTCAGCGAC
12061 CGCGAGCTGA TGCACAGCCT GCAAAGGGCC CTGGCTGGCA CGGGCAGCGG CGATAGAGAG
12121 GCCGAGTCCT ACTTTGACGC GGGCGCTGAC CTGCGCTGGG CCCCAAGCCG ACGCGCCCTG
12181 GAGGCAGCTG GGGCCGGACC TGGGCTGGCG GTGGCACCCG CGCGCGCTGG CAACGTCGGC
12241 GGCGTGGAGG AATATGACGA GGACGATGAG TACGAGCCAG AGGACGGCGA GTACTAAGCG
12301 GTGATGTTTC TGATCAGATG ATGCAAGACG CAACGGACCC GGCGGTGCGG GCGGCGCTGC
12361 AGAGCCAGCC GTCCGGCCTT AACTCCACGG ACGACTGGCG CCAGGTCATG GACCGCATCA
12421 TGTCGCTGAC TGCGCGCAAT CCTGACGCGT TCCGGCAGCA GCCGCAGGCC AACCGGCTCT
12481 CCGCAATTCT GGAAGCGGTG GTCCCGGCGC GCGCAAACCC CACGCACGAG AAGGTGCTGG
12541 CGATCGTAAA CGCGCTGGCC GAAAACAGGG CCATCCGGCC CGACGAGGCC GGCCTGGTCT
12601 ACGACGCGCT GCTTCAGCGC GTGGCTCGTT ACAACAGCGG CAACGTGCAG ACCAACCTGG
12661 ACCGGCTGGT GGGGGATGTG CGCGAGGCCG TGGCGCAGCG TGAGCGCGCG CAGCAGCAGG
12721 GCAACCTGGG CTCCATGGTT GCACTAAACG CCTTCCTGAG TACACAGCCC GCCAACGTGC
12781 CGCGGGGACA GGAGGACTAC ACCAACTTTG TGAGCGCACT GCGGCTAATG GTGACTGAGA
12841 CACCGCAAAG TGAGGTGTAC CAGTCTGGGC CAGACTATTT TTTCCAGACC AGTAGACAAG
12901 GCCTGCAGAC CGTAAACCTG AGCCAGGCTT TCAAAAACTT GCAGGGGCTG TGGGGGGTGC
12961 GGGCTCCCAC AGGCGACCGC GCGACCGTGT CTAGCTTGCT GACGCCCAAC TCGCGCCTGT
13021 TGCTGCTGCT AATAGCGCCC TTCACGGACA GTGGCAGCGT GTCCCGGGAC ACATACCTAG
13081 GTCACTTGCT GACACTGTAC CGCGAGGCCA TAGGTCAGGC GCATGTGGAC GAGCATACTT
13141 TCCAGGAGAT TACAAGTGTC AGCCGCGCGC TGGGGCAGGA GGACACGGGC AGCCTGGAGG
13201 CAACCCTAAA CTACCTGCTG ACCAACCGGC GGCAGAAGAT CCCCTCGTTG CACAGTTTAA
13261 ACAGCGAGGA GGAGCGCATT TTGCGCTACG TGCAGCAGAG CGTGAGCCTT AACCTGATGC 
13321 GCGACGGGGT AACGCCCAGC GTGGCGCTGG ACATGACCGC GCGCAACATG GAACCGGGCA 
13381 TGTATGCCTC AAACCGGCCG TTTATCAACC GCCTAATGGA CTACTTGCAT CGCGCGGCCG 
13441 CCGTGAACCC CGAGTATTTC ACCAATGCCA TCTTGAACCC GCACTGGCTA CCGCCCCCTG
13501 GTTTCTACAC CGGGGGATTC GAGGTGCCCG AGGGTAACGA TGGATTCCTC TGGGACGACA 
13561 TAGACGACAG CGTGTTTTCC CCGCAACCGC AGACCCTGCT AGAGTTGCAA CAGCGCGAGC
13621 AGGCAGAGGC GGCGCTGCGA AAGGAAAGCT TCCGCAGGCC AAGCAGCTTG TCCGATCTAG 
13681 GCGCTGCGGC CCCGCGGTCA GATGCTAGTA GCCCATTTCC AAGCTTGATA GGGTCTCTTA
13741 CCAGCACTCG CACCACCCGC CCGCGCCTGC TGGGCGAGGA GGAGTACCTA AACAACTCGC 
13801 TGCTGCAGCC GCAGCGCGAA AAAAACCTGC CTCCGGCATT TCCCAACAAC GGGATAGAGA 
13861 GCCTAGTGGA CAAGATGAGT AGATGGAAGA CGTACGCGCA GGAGCACAGG GACGTGCCAG
13921 GCCCGCGCCC GCCCACCCGT CGTCAAAGGC ACGACCGTCA GCGGGGTCTG GTGTGGGAGG
13981 ACGATGACTC GGCAGACGAC AGCAGCGTCC TGGATTTGGG AGGGAGTGGC AACCCGTTTG
14041 CGCACCTTCG CCCCAGGCTG GGGAGAATGT TTTAAAAAAA AAAAAGCATG ATGCAAAATA 
14101 AAAAACTCAC CAAGGCCATG GCACCGAGCG TTGGTTTTCT TGTATTCCCC TTAGTATGCG 
14161 GCGCGCGGCG ATGTATGAGG AAGGTCCTCC TCCCTCCTAC GAGAGTGTGG TGAGCGCGGC 
14221 GCCAGTGGCG GCGGCGCTGG GTTCTCCCTT CGATGCTCCC CTGGACCCGC CGTTTGTGCC 
14281 TCCGCGGTAC CTGCGGCCTA CCGGGGGGAG AAACAGCATC CGTTACTCTG AGTTGGCACC
14341 CCTATTCGAC ACCACCCGTG TGTACCTGGT GGACAACAAG TCAACGGATG TGGCATCCCT 
14401 GAACTACCAG AACGACCACA GCAACTTTCT GACCACGGTC ATTCAAAACA ATGACTACAG 
14461 CCGGGGGGAG GCAAGCACAC AGACCATCAA TCTTGACGAC CGGTCGCACT GGGGCGGCGA 
14521 CCTGAAAACC ATCCTGCATA CCAACATGCC AAATGTGAAC GAGTTCATGT TTACCAATAA 
14581 GTTTAAGGCG CGGGTGATGG TGTCGCGCTT GCCTACTAAG GACAATCAGG TGGAGCTGAA 
14641 ATACGAGTGG GTGGAGTTCA CGCTGCCCGA GGGCAACTAC TCCGAGACCA TGACCATAGA 
14701 CCTTATGAAC AACGCGATCG TGGAGCACTA CTTGAAAGTG GGCAGACAGA ACGGGGTTCT 
14761 GGAAAGCGAC ATCGGGGTAA AGTTTGACAC CCGCAACTTC AGACTGGGGT TTGACCCCGT 
14821 CACTGGTCTT GTCATGCCTG GGGTATATAC AAACGAAGCC TTCCATCCAG ACATCATTTT 
14881 GCTGCCAGGA TGCGGGGTGG ACTTCACCCA CAGCCGCCTG AGCAACTTGT TGGGCATCCG 
14941 CAAGCGGCAA CCCTTCCAGG AGGGCTTTAG GATCACCTAC GATGATCTGG AGGGTGGTAA 
15001 CATTCCCGCA CTGTTGGATG TGGACGCCTA CCAGGCGAGC TTGAAAGATG ACACCGAACA 
15061 GGGCGGGGGT GGCGCAGGCG GCAGCAACAG CAGTGGCAGC GGCGCGGAAG AGAACTCCAA 
15121 CGCGGCAGCC GCGGCAATGC AGCCGGTGGA GGACATGAAC GATCATGCCA TTCGCGGCGA 
15181 CACCTTTGCC ACACGGGCTG AGGAGAAGCG CGCTGAGGCC GAAGCAGCGG CCGAAGCTGC 
15241 CGCCCCCGCT GCGCAACCCG AGGTCGAGAA GCCTCAGAAG AAACCGGTGA TCAAACCCCT 
15301 GACAGAGGAC AGCAAGAAAC GCAGTTACAA CCTAATAAGC AATGACAGCA CCTTCACCCA 
15361 GTACCGCAGC TGGTACCTTG CATACAACTA CGGCGACCCT CAGACCGGAA TCCGCTCATG 
15421 GACCCTGCTT TGCACTCCTG ACGTAACCTG CGGCTCGGAG CAGGTCTACT GGTCGTTGCC 
15481 AGACATGATG CAAGACCCCG TGACCTTCCG CTCCACGCGC CAGATCAGCA ACTTTCCGGT 
15541 GGTGGGCGCC GAGCTGTTGC CCGTGCACTG CAAGAGCTTC TACAACGACC AGGCCGTCTA 
15601 CTCCCAACTC ATCCGCCAGT TTACCTCTCT GACCCACGTG TTCAATCGCT TTCCCGAGAA 
15661 CCAGATTTTG GCGCGCCCGC CAGCCCCCAC CATCACCACC GTCAGTGAAA ACGTTCCTGC 
15721 TCTCACAGAT CACGGGACGC TACCGCTGCG CAACAGCATC GGAGGAGTCC AGCGAGTGAC 
15781 CATTACTGAC GCCAGACGCC GCACCTGCCC CTACGTTTAC AAGGCCCTGG GCATAGTCTC 
15841 GCCGCGCGTC CTATCGAGCC GCACTTTTTG AGCAAGCATG TCCATCCTTA TATCGCCCAG 
15901 CAATAACACA GGCTGGGGCC TGCGCTTCCC AAGCAAGATG TTTGGCGGGG CCAAGAAGCG 
15961 CTCCGACCAA CACCCAGTGC GCGTGCGCGG GCACTACCGC GCGCCCTGGG GCGCGCACAA 
16021 ACGCGGCCGC ACTGGGCGCA CCACCGTCGA TGACGCCATC GACGCGGTGG TGGAGGAGGC 
16081 GCGCAACTAC ACGCCCACGC CGCCACCAGT GTCCACAGTG GACGCGGCCA TTCAGACCGT 
16141 GGTGCGCGGA GCCCGGCGCT ATGCTAAAAT GAAGAGACGG CGGAGGCGCG TAGCACGTCG 
16201 CCACCGCCGC CGACCCGGCA CTGCCGCCCA ACGCGCGGCG GCGGCCCTGC TTAACCGCGC 
16261 ACGTCGCACC GGCCGACGGG CGGCCATGCG GGCCGCTCGA AGGCTGGCCG CGGGTATTGT 
16321 CACTGTGCCC CCCAGGTCCA GGCGACGAGC GGCCGCCGCA GCAGCCGCGG CCATTAGTGC 
16381 TATGACTCAG GGTCGCAGGG GCAACGTGTA TTGGGTGCGC GACTCGGTTA GCGGCCTGCG 
16441 CGTGCCCGTG CGCACCCGCC CCCCGCGCAA CTAGATTGCA AGAAAAAACT ACTTAGACTC 
nl^^^IT^^ ATGTATCCAG CGGCGGCGGC GCGCAACGAA GCTATGTCCA AGCGCAAAAT
16561 CAAAGAAGAG ATGCTCCAGG TCATCGCGCC GGAGATCTAT GGCCCCCCGA APa^finTT^r 
16621 GCAGGATTAC AAGCCCCGAA AGCTAAAGCG gScSg SS^gS^G A?^S^J 

16681 tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggc^? ggSJSJ^S? 

llsm SS^^o^"^ CGCGTAAAAC GTGTTrTCCG ACCCGGCAC? aSJ^StcJ SSJcS 
ltH^ JSSfS^^ ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG SSaSS 
lltt^ SSS^^ GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAgSgC aS^SS 

itlV: n^^^^ ccgctggacg agggcaaccc aacacctagc ctaaagcccg taaSJSa 
16981 gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc S5SS^^ 

Xlltl SiSSS? CCCACCGTGC AGCTGATGGT ACCCAAGCGC ?Sg^ 

17101 GGAAAAAATG ACCGTGGAAC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC cTATPaAnr^ 

J^iS! ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atSSSJ 

r^ll^ CAGTATTGCC ACCGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTC StcIS^^JS 
17281 GGCGGATGCC GCGGTCCAGG CGGTCGCTGC GGCCGCGTCC AAGaSJSS cSSSS^a 
17341 AACGGACCCG TGGATGTTTC GCGTTTCAGC CCCCCGGPOr r^nnnr-^^ CGGAGGTGCA 
Ivtll ^SSSf " AGCGCGCTAC .SSJJa JScSSS CC^C^ 0^^^^ 
lltll rSSSrfnJ GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCcSc G^SS^^cS 
,111^ o^o^"^^^ CGCCGCCGCC GTCGCCGTCG CCA6CCCGTG CTGGCCCCGA TTTCCGTC^G 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT IJSrr^^S 
CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGaStG GCCCTCA?^? SSS^rS^Jr 
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAgS AgSSStcG CcSScI^S 
JZZ" CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCctSS S?^^!? 

1111": SSf^^^^ atcctgcccc tccttattcc actgatcSc gS^SS 5?SSSc? 
IiIai rf^^ll?.^ tccgtggcct tccaggcgca gagacactga ttaaaaacaa cttccatctc 

ilotl ri^V^ AATAAAAAGT CTGGACTCTC ACGCTCGCTT GGTCCTGTAA CTaSJSS 
18001 GAATGGAAGA CATCAACTTT GCGTCTCTGG CCCCGCGACA CGGCTCGCGC ccr^r^nn 

S^^^^ AGATATCGGC ACCAGCAATA TGAGCGGTGG ^SSJSSc tSSSJJS? 

I^T^''^^^ CATTAAAAAT TTCGGTTCCA CCGTTAAGaJ CTAtSS^ ISSSS 
"i!^ ^C^^CAGCAC AGGCCAGATG CTGAGGGATA AGTTGAAAGA GCAAAATTTC SJ^SJJ^^n 
lllt^ 'Saitt^'^ CCl^GCCTCT GGCATTAGCG GGGTGGtSS ^SSSJSI? SSSJS 
18301 AAAATAAGAT TAACAGTAAG CTTGATCCCC GCCCTCCCGT AGAGGAGCCT SSSSS 
^o^o^ TGGAGACAGT GTCTCCAGAG GGGCGTGGCG AAAAGCGTCC GCGCcJ^SS aSgSSS 

n'^lnf^^'' GCAAATAGAC GAGCCTCCCT CGTACGAGGA gSSSSJg cSg^JSS? 

lltli ^SSSSoS '^c^^ca'tcgcg cccatggcta ccggagtgct ggGccagcac a?Jc??SS^ 

lllV; GCCTCCCCCC GCCGACACCC AGCAGAAACC TGTGCTGCCA GgJccS??5 

18601 CCGTTGTTGT AACCCGTCCT AGCCGCGCGT CCCTGCGCCG CGCCGCCAGC GGTCcSSfS 
18661 CGTTGCGGCC CGTAGCCAGT GGCAACTGGC AAAGCACACT gScScaTC SS^^J^J 
Jfi^p} ^^"^ CCTGAAGCGC CGACGA-KSCT TCTCAaSaS T^CgJScG tI^I^^ 

^^^r^^"" CCATGTCGCC GCCAGAGGAG CTGCTGAGCC GCCGcSgcC JSSSSS 
^tltl ^T«=CTACC CCTTCGAT6A TGCCGCAGTG GTCTTACATG CACATCtcS ScSSr^ 
Jfi«J ^JS^^"^ CTGAGCCCCG GGCTGGTCCA GTTTGCCCGC G^^aJ^SS SSJ^S 
18961 CCTGAATAAC AAGTTTAGAA ACCCCACGGT GGCGCCTACG CACGACGTGA CC^^ 
lloll c'Slrrrrrl ^^^^^^^^^ GGTTCAICCC TGTOGACcS? SaSaSS ^ctSSS? 
?Q?!i n^^^^^^ TTCACCCTAG CTGTGGGTGA TAACCGTGTG CTGGACATGG CTTCCaSta 
^92^ r^n^^n^l"" CGCGGCGTGC TGGACAGGGG CCCTACTTTT AaSctAC? ^^ScaSS 
^IIV; ^n^^^"" CTGGCTCCCA AGGGTGCCCC AAATCCTTGC GAATGGGATG AAGCtcSS 

TGCTCTTGAA ATAAACCTAG AAGAAGAGGA CGATGACAAC GAAGACGAAG TaSSSSa 
illll Jff^o?P^o° CAAAAAACTC ACGTArriGG GCAGGCGCCT tXJSSSJa JSSJS 

J^^Po^f attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaSSct 
tpJm ^^^S^"^ cctcaaatag gagaatctca gtggtacgaa actgaaSa atcmSSS 

lHl^ r^r.^t^'^ CTTAAAAAGA CTACCCCAAT GAAACCATGT TAcSttCA? JJ^SSJ^^ 

ilell aaggcattct tgtaaagcaa caaaatggJI agcS^g 

196«1 ^S^^^ atgcaatttt tctcaactac tgaggcgacc gcaggcaatc gtgataactt 

^l^^} GACTCCTAAA GTGGTATTGT ACAGTGAAGA TGTAGATATA GAAACCCCAG ACAP^tI^ 
19741 TTCTTACATG CCCACTATTA AGGAAGGTAA CTCACGaSS ^^^^ ^SS^JS 



FIG. 8F 



WO 03/031588

PCT/US02/32512






19801 GCCCAACAGG CCTAATTACA TTCCTTTTAG GGACAATTTT ATTGGTCTAA TGTATTACAA
19861 CAGCACGGGT AATATGGGTG TTCTGGCGGG CCAAGCATCG CAGTTGAATG CTGTTGTAGA
19921 TTTGCAAGAC AGAAACACAG AGCTTTCATA CCAGCTTTTG CTTGATTCCA TTGGTGATAG
19981 AACCAGGTAC TTTTCTATGT GGAATCAGGC TGTTGACAGC TATGATCCAG ATGTTAGAAT
20041 TATTGAAAAT CATGGAACTG AAGATGAACT TCCAAATTAC TGCTTTCCAC TGGGAGGTGT
20101 GATTAATACA GAGACTCTTA CCAAGGTAAA ACCTAAAACA GGTCAGGAAA ATGGATGGGA 
20161 AAAAGATGCT ACAGAATTTT CAGATAAAAA TGAAATAAGA GTTGGAAATA ATTTTGCCAT 
20221 GGAAATCAAT CTAAATGCCA ACCTGTGGAG AAATTTCCTG TACTCCAACA TAGCGCTGTA 
20281 TTTGCCCGAC AAGCTAAAGT ACAGTCCTTC CAACGTAAAA ATTTCTGATA ACCCAAACAC
20341 CTACGACTAC ATGAACAAGC GAGTGGTGGC TCCCGGGTTA GTGGACTGCT ACATTAACCT
20401 TGGAGCACGC TGGTCCCTTG ACTATATGGA CAACGTCAAC CCATTTAACC ACCACCGCAA
20461 TGCTGGCCTG CGCTACCGCT CAATGTTGCT GGGCAATGGT CGCTATGTGC CCTTCCACAT 
20521 CCAGGTGCCT CAGAAGTTCT TTGCCATTAA AAACCTCCTT CTCCTGCCGG GCTCATACAC 
20581 CTACGAGTGG AACTTCAGGA AGGATGTTAA CATGGTTCTG CAGAGGTCCC TAGGAAATGA 
20641 CCTAAGGGTT GACGGAGCCA GCATTAAGTT TGATAGCATT TGCCTTTACG CCACCTTCTT
20701 CCCCATGGCC CACAACACCG CCTCCACGCT TGAGGCCATG CTTAGAAACG ACACCAACGA 
20761 CCAGTCCTTT AACGACTATC TCTCCGCCGC CAACATGCTC TACCCTATAC CCGCCAACGC 
20821 TACCAACGTG CCCATATCCA TCCCCTCCCG CAACTGGGCG GCTTTCCGCG GCTGGGCCTT
20881 CACGCGCCTT AAGACTAAGG AAACCCCATC ACTGGGCTCG GGCTACGACC CTTATTACAC 
20941 CTACTCTGGC TCTATACCCT ACCTAGATGG AACCTTTTAC CTCAACCACA CCTTTAAGAA 
21001 GGTGGCCATT ACCTTTGACT CTTCTGTCAG CTGGCCTGGC AATGACCGCC TGCTTACCCC 
21061 CAACGAGTTT GAAATTAAGC GCTCAGTTGA CGGGGAGGGT TACAACGTTG CCCAGTGTAA
21121 CATGACCAAA GACTGGTTCC TGGTACAAAT GCTAGCTAAC TACAACATTG GCTACCAGGG 
21181 CTTCTATATC CCAGAGAGCT ACAAGGACCG CATGTACTCC TTCTTTAGAA ACTTCCAGCG 
21241 CATGAGCCGT CAGGTGGTGG ATGATACTAA ATACAAGGAC TACCAACAGG TGGGCATCCT 
21301 ACACCAACAC AACAACTCTG GATTTGTTGG CTACCTTGCC CCCACCATGC GCGAAGGACA
21361 GGCCTACCCT GCTAACTTCC CCTATCCGCT TATAGGCAAG ACCGCAGTTG ACAGCATTAC 
21421 CCAGAAAAAG TTTCTTTGCG ATCGCACCCT TTGGCGCATC CCATTCTCCA GTAACTTTAT 
21481 GTCCATGGGC GCACTCACAG ACCTGGGCCA AAACCTTCTC TACGCCAACT CCGCCCACGC 
21541 GCTAGACATG ACTTTTGAGG TGGATCCCAT GGACGAGCCC ACCCTTCTTT ATGTTTTGTT 
21601 TGAAGTCTTT GACGTGGTCC GTGTGCACCG GCCGCACCGC GGCGTCATCG AAACCGTGTA
21661 CCTGCGCACG CCCTTCTCGG CCGGCAACGC CACAACATAA AGAAGCAAGC AACATCAACA 
21721 ACAGCTGCCG CCATGGGCTC CAGTGAGCAG GAACTGAAAG CCATTGTCAA AGATCTTGGT 
21781 TGTGGGCCAT ATTTTTTGGG CACCTATGAC AAGCGCTTTC CAGGCTTTGT TTCTCCACAC 
21841 AAGCTCGCCT GCGCCATAGT CAATACGGCC GGTCGCGAGA CTGGGGGCGT ACACTGGATG
21901 GCCTTTGCCT GGAACCCGGA CTCAAAAACA TGCTACCTCT TTGAGCCCTT TGGCTTTTCT 
21961 GACCAGCGAC TCAAGCAGGT TTACCAGTTT GAGTACGAGT CACTCCTGCG CCGTAGCGCC 
22021 ATTGCTTCTT CCCCCGACCG CTGTATAACG CTGGAAAAGT CCACCCAAAG CGTACAGGGG
22081 CCCAACTCGG CCGCCTGTGG ACTATTCTGC TGCATGTTTC TCCACGCCTT TGCCAACTGG 
22141 CCCCAAACTC CCATGGATCA CAACCCCACC ATGAACCTTA TTACCGGGGT ACCCAACTCC 
22201 ATGCTCAACA GTCCCCAGGT ACAGCCCACC CTGCGTCGCA ACCAGGAACA GCTCTACAGC 
22261 TTCCTGGAGC GCCACTCGCC CTACTTCCGC AGCCACAGTG CGCAGATTAG GAGCGCCAGT 
22321 TCTTTTTGTC ACTTGAAAAA CATGTAAAAA TAATGTACTA GAGACACTTT CAATAAAGGG 
22381 AAATGCTTTT ATTTGTACAC TCTCGGGTGA TTATTTACCC CCACCCTTGC CGTCTGCGCC 
22441 GTTTAAAAAT CAAAGGGGTT CTGCCGCGCA TCGCTATGCG CCACTGGCAG GGACACGTTG 
22501 CGATACTGGT GTTTAGTGCT CCACTTAAAC TCAGGCACAA CCATCCGCGG CAGCTCGGTG 
22561 AAGTTTTCAC TCCACAGGCT GCGCACCATC ACCAACGCGT TTAGCAGGTC GGGCGCCGAT
22621 ATCTTGAAGT CGCAGTTGGG GCCTCCGCCC TGCGCGCGCG AGTTGCGATA CACAGGGTTG 
22681 CAGCACTGGA ACACTATCAG CGCCGGGTGG TGCACGCTGG CCAGCACGCT CTTGTCGGAG 
22741 ATCAGATCCG CGTCCAGGTC CTCCGCGTTG CTCAGGGCGA ACGGAGTCAA CTTTGGTAGG 
22801 TGCCTTCCCA AAAAGGGCGC GTGCCCAGGC TTTGAGTTGC ACTCGCACCG TAGTGGCATC 
22861 AAAAGGTGAC CGTGCCCGGT CTGGGCGTTA GGATACAGCG CCTGCATAAA AGCCTTGATC 
22921 TGCTTAAAAG CCACCTGAGC CTTTGCGCCT TCAGAGAAGA ACATGCCGCA AGACTTGCCG
22981 GAAAACTGAT TGGCCGGACA GGCCGCGTCG TGCACGCAGC ACCTTGCGTC GGTGTTGGAG 
23041 ATCTGCACCA CATTTCGGCC CCACCGGTTC TTCACGATCT TGGCCTTGCT AGACTGCTCC 
23101 TTCAGCGCGC GCTGCCCGTT TTCGCTCGTC ACATCCATTT CAATCACGTG CTCCTTATTT
23161 ATCATAATGC TTCCGTGTAG ACACTTAAGC TCGCCTTCGA TCTCAGCGCA GCGGTGCAGC
23221 CACAACGCGC AGCCCGTGGG CTCGTGATGC TTGTAGGTCA CCTCTGCAAA CGACTGCAGG
23281 TACGCCTGCA GGAATCGCCC CATCATCGTC ACAAAGGTCT TGTTGCTGGT GAAGGTCAGC
23341 TGCAACCCGC GGTGCTCCTC GTTCAGCCAG GTCTTGCATA CGGCCGCCAG AGCTTCCACT
23401 TGGTCAGGCA GTAGTTTGAA GTTCGCCTTT AGATCGTTAT CCACGTGGTA CTTGTCCATC
23461 AGCGCGCGCG CAGCCTCCAT GCCCTTCTCC CACGCAGACA CGATCGGCAC ACTCAGCGGG
23521 TTCATCACCG TAATTTCACT TTCCGCTTCG CTGGGCTCTT CCTCTTCCTC TTGCGTCCGC
23581 ATACCACGCG CCACTGGGTC GTCTTCATTC AGCCGCCGCA CTGTGCGCTT ACCTCCTTTG
23641 CCATGCTTGA TTAGCACCGG TGGGTTGCTG AAACCCACCA TTTGTAGCGC CACATCTTCT
23701 CTTTCTTCCT CGCTGTCCAC GATTACCTCT GGTGATGGCG GGCGGTCGGG CTTGGGAGAA
23761 GGGCGCTTCT TTTTCTTCTT GGGCGCAATG GCCAAATCCG CCGCCGAGGT CGATGGCCGC
23821 GGGCTGGGTG TGCGCGGCAC CAGCGCGTCT TGTGATGAGT CTTCCTCGTC CTCGGACTCG
23881 ATACGCCGCC TCATCCGCTT TTTTGGGGGC GCCCGGGGAG GCGGCGGCGA CGGGGACGGG
23941 GACGACACGT CCTCCATGGT TGGGGGACGT CGCGCCGCAC CGCGTCCGCG CTCGGGGGTG
24001 GTTTCGCGCT GCTCCTCTTC CCGACTGGCC ATTTCCTTCT CCTATAGGCA GAAAAAGATC
24061 ATGGAGTCAG TCGAGAAGAA GGACAGCCTA ACCGCCCCCT CTGAGTTCGC CACCACCGCC
24121 TCCACCGATG CCGCCAACGC GCCTACCACC TTCCCCGTCG AGGCACCCCC GCTTGAGGAG
24181 GAGGAAGTGA TTATCGAGCA GGACCCAGGT TTTGTAAGCG AAGACGACGA GGACCGCTCA
24241 GTACCAACAG AGGATAAAAA GCAAGACCAG GACAACGCAG AGGCAAACGA GGAACAAGTC
24301 GGGCGGGGGG ACGAAAGGCA TGGCGACTAC CTAGATGTGG GAGACGACGT GCTGTTGAAG
24361 CATCTGCAGC GCCAGTGCGC CATTATCTGC GACGCGTTGC AAGAGCGCAG CGATGTGCCC
24421 CTCGCCATAG CGGATGTCAG CCTTGCCTAC GAACGCCACC TATTCTCACC GCGCGTACCC
24481 CCCAAACGCC AAGAAAACGG CACATGCGAG CCCAACCCGC GCCTCAACTT CTACCCCGTA
24541 TTTGCCGTGC CAGAGGTGCT TGCCACCTAT CACATCTTTT TCCAAAACTG CAAGATACCC
24601 CTATCCTGCC GTGCCAACCG CAGCCGAGCG GACAAGCAGC TGGCCTTGCG GCAGGGCGCT
24661 GTCATACCTG ATATCGCCTC GCTCAACGAA GTGCCAAAAA TCTTTGAGGG TCTTGGACGC
24721 GACGAGAAGC GCGCGGCAAA CGCTCTGCAA CAGGAAAACA GCGAAAATGA AAGTCACTCT
24781 GGAGTGTTGG TGGAACTCGA GGGTGACAAC GCGCGCCTAG CCGTACTAAA ACGCAGCATC
24841 GAGGTCACCC ACTTTGCCTA CCCGGCACTT AACCTACCCC CCAAGGTCAT GAGCACAGTC
24901 ATGAGTGAGC TGATCGTGCG CCGTGCGCAG CCCCTGGAGA GGGATGCAAA TTTGCAAGAA
24961 CAAACAGAGG AGGGCCTACC CGCAGTTGGC GACGAGCAGC TAGCGCGCTG GCTTCAAACG
25021 CGCGAGCCTG CCGACTTGGA GGAGCGACGC AAACTAATGA TGGCCGCAGT GCTCGTTACC
25081 GTGGAGCTTG AGTGCATGCA GCGGTTCTTT GCTGACCCGG AGATGCAGCG CAAGCTAGAG
25141 GAAACATTGC ACTACACCTT TCGACAGGGC TACGTACGCC AGGCCTGCAA GATCTCCAAC
25201 GTGGAGCTCT GCAACCTGGT CTCCTACCTT GGAATTTTGC ACGAAAACCG CCTTGGGCAA
25261 AACGTGCTTC ATTCCACGCT CAAGGGCGAG GCGCGCCGCG ACTACGTCCG CGACTGCGTT
25321 TACTTATTTC TATGCTACAC CTGGCAGACG GCCATGGGCG TTTGGCAGCA GTGCTTGGAG
25381 GAGTGCAACC TCAAGGAGCT GCAGAAACTG CTAAAGCAAA ACTTGAAGGA CCTATGGACG
25441 GCCTTCAACG AGCGCTCCGT GGCCGCGCAC CTGGCGGACA TCATTTTCCC CGAACGCCTG
25501 CTTAAAACCC TGCAACAGGG TCTGCCAGAC TTCACCAGTC AAAGCATGTT GCAGAACTTT
25561 AGGAACTTTA TCCTAGAGCG CTCAGGAATC TTGCCCGCCA CCTGCTGTGC ACTTCCTAGC
25621 GACTTTGTGC CCATTAAGTA CCGCGAATGC CCTCCGCCGC TTTGGGGCCA CTGCTACCTT
25681 CTGCAGCTAG CCAACTACCT TGCCTACCAC TCTGACATAA TGGAAGACGT GAGCGGTGAC
25741 GGTCTACTGG AGTGTCACTG TCGCTGCAAC CTATGCACCC CGCACCGCTC CCTGGTTTGC
25801 AATTCGCAGC TGCTTAACGA AAGTCAAATT ATCGGTACCT TTGAGCTGCA GGGTCCCTCG
25861 CCTGACGAAA AGTCCGCGGC TCCGGGGTTG AAACTCACTC CGGGGCTGTG GACGTCGGCT
25921 TACCTTCGCA AATTTGTACC TGAGGACTAC CACGCCCACG AGATTAGGTT CTACGAAGAC
25981 CAATCCCGCC CGCCAAATGC GGAGCTTACC GCCTGCGTCA TTACCCAGGG CCACATTCTT
26041 GGCCAATTGC AAGCCATCAA CAAAGCCCGC CAAGAGTTTC TGCTACGAAA GGGACGGGGG
26101 GTTTACTTGG ACCCCCAGTC CGGCGAGGAG CTCAACCCAA TCCCCCCGCC GCCGCAGCCC
26161 TATCAGCAGC AGCCGCGGGC CCTTGCTTCC CAGGATGGCA CCCAAAAAGA AGCTGCAGCT
26221 GCCGCCGCCA CCCACGGACG AGGAGGAATA CTGGGACAGT CAGGCAGAGG AGGTTTTGGA
26281 CGAGGAGGAG GAGGACATGA TGGAAGACTG GGAGAGCCTA GACGAGGAAG CTTCCGAGGT
26341 CGAAGAGGTG TCAGACGAAA CACCGTCACC CTCGGTCGCA TTCCCCTCGC CGGCGCCCCA
26401 GAAATCGGCA ACCGGTTCCA GCATGGCTAC AACCTCCGCT CCTCAGGCGC CGCCGGCACT
26461 GCCCGTTCGC CGACCCAACC GTAGATGGGA CACCACTGGA ACCAGGGCCG GTAAGTCCAA
26521 GCAGCCGCCG CCGTTAGCCC AAGAGCAACA ACAGCGCCAA GGCTACCGCT CATGGCGCGG 
26581 GCACAAGAAC GCCATAGTTG CTTGCTTGCA AGACTGTGGG GGCAACATCT CCTTCGCCCG
26641 CCGCTTTCTT CTCTACCATC ACGGCGTGGC CTTCCCCCGT AACATCCTGC ATTACTACCG
26701 TCATCTCTAC AGCCCATACT GCACCGGCGG CAGCGGCAGC GGCAGCAACA GCAGCGGCCA 
26761 CACAGAAGCA AAGGCGACCG GATAGCAAGA CTCTGACAAA GCCCAAGAAA TCCACAGCGG 
26821 CGGCAGCAGC AGGAGGAGGA GCGCTGCGTC TGGCGCCCAA CGAACCCGTA TCGACCCGCG 
26881 AGCTTAGAAA CAGGATTTTT CCCACTCTGT ATGCTATATT TCAACAGAGC AGGGGCCAAG 
26941 AAGAAGAGCT GAAAATAAAA AACAGGTCTC TGCGATCCCT CACCCGCAGC TGCCTGTATC
27001 ACAAAAGCGA AGATCAGCTT CGGCGCACGC TGGAAGACGC GGAGGCTCTC TTCAGTAAAT 
27061 ACTGCGCGCT GACTCTTAAG GACTAGTTTC GCGCCCTTTC TCAAATTTAA GCGCGAAAAC
27121 TACGTCATCT GCAGCGGCCA CACCCGGCGC CAGCACCTGT CGTCAGCGCC ATTATGAGCA 
27181 AGGAAATTCC CACGCCCTAC ATGTGGAGTT ACCAGCCACA AATGGGACTT GCGGCTGGAG 
27241 CTGCCCAAGA CTACTCAACC CGAATAAACT ACATGAGCGC GGGACCCCAC ATGATATCCC 
27301 GGGTCAACGG AATCCGCGCC CACCGAAACC GAATTCTCTT GGAACAGGCG GCTATTACCA 
27361 CCACACCTCG TAATAACCTT AATCCCCGTA GTTGGCCCGC TGCCCTGGTG TACCAGGAAA 
27421 GTCCCGCTCC CACCACTGTG GTACTTCCCA GAGACGCCCA GGCCGAAGTT CAGATGACTA 
27481 ACTCAGGGGC GCAGCTTGCG GGCGGCTTTC GTCACAGGGT GCGGTCGCCC GGGCAGGGTA 
27541 TAACTCACCT GACAATCAGA GGGCGAGGTA TTCAGCTCAA CGACGAGTCG GTGAGCTCCT 
27601 CGCTTGGTCT CCGTCCGGAC GGGACATTTC AGATCGGCGG CGCCGGCCGT CCTTCATTCA 
27661 CGCCTCGTCA GGCAATCCTA ACTCTGCAGA CCTCGTCCTC TGAGCCGCGC TCTGGAGGCA 
27721 OTGGAACTCT GCAATTTATT GAGGAGTTTG TGCCATCGGT CTACTTTAAC CCCTTCTCGG 
27781 GACCTCCCGG CCACTATCCG GATCAATTTA TTCCTAACTT TGACGCGGTA AAGGACTCGG 
27841 CGGACGGCTA CGACTGAATG TTAAGTGGAG AGGCAGAGCA ACTGCGCCTG AAACACCTGG 
27901 TCCACTGTCG CCGCCACAAG TGCTTTGCCC GCGACTCCGG TGAGTTTTGC TACTTTGAAT 
27961 TGCCCGAGGA TCATATCGAG GGCCCGGCGC ACGGCGTCCG GCTTACCGCC GAGGGAGAGC 
28021 TTGCCCGTAG CCTGATTCGG GAGTTTACCC AGCGCCCCCT GCTAGTTGAG CGGGACAGGG 
28081 GACCCTGTGT TCTCACTGTG ATTTGCAACT GTCCTAACCT TGGATTACAT CAAGATCTTT 
28141 GTTGCCATCT CTGTGCTGAG TATAATAAAT ACAGAAATTA AAATATACTG GGGCTCCTAT 
28201 CGCCATCCTG TAAACGCCAC CGTCTTCACC CGCCCAAGCA AACCAAGGCG AACCTTACCT
28261 iOGTACTTTTA ACATCTCTCC CTCTGTGATT TACAACAGTT TCAACCCAGA CGGAGTGAGT 
28321 CTACGAGAGA ACCTCTCCGA GCTCAGCTAC TCCATCAGAA AAAACACCAC CCTCCTTACC 
28381 TGCCGGGAAC GTACGAGTGC GTCACCGGCC GCTGCACCAC ACCTACCGCC TGACCGTAAA 
28441 CCAGACTTTT TCCGGACAGA CCTCAATAAC TCTGTTTACC AGAACAGGAG GTGAGCTTAG 
28501 AAAACCCTTA GGGTATTAGG CCAAAGGCGC AGCTACTGTG GGGTTTATGA ACAATTCAAG 
28561 CAACTCTACG GGCTATTCTA ATTCAGGTTT CTCTAGAATC GGGGTTGGGG TTATTCTCTG 
28621 TCTTGTGATT CTCTTTATTC TTATACTAAC GCTTCTCTGC CTAAGGCTCG CCGCCTGCTG
28681 TGTGCACATT TGCATTTATT GTCAGCTTTT TAAACGCTGG GGTCGCCACC CAAGATGATT 
28741 AGGTACATAA TCCTAGGTTT ACTCACCCTT GCGTCAGCCC ACGGTACCAC CCAAAAGGTG 
28801 GATTTTAAGG AGCCAGCCTG TAATGTTACA TTCGCAGCTG AAGCTAATGA GTGCACCACT 
28861 CTTATAAAAT GCACCACAGA ACATGAAAAG CTGCTTATTC GCCACAAAAA CAAAATTGGC 
28921 AAGTATGCTG TTTATGCTAT TTGGCAGCCA GGTGACACTA CAGAGTATAA TGTTACAGTT 
28981 TTCCAGGGTA AAAGTCATAA AACTTTTATG TATACTTTTC CATTTTATGA AATGTGCGAC 
29041 ATTACCATGT ACATGAGCAA ACAGTATAAG TTGTGGGCCC CACAAAATTG TGTGGAAAAC 
29101 ACTGGCACTT TCTGCTGCAC TGCTATGCTA ATTACAGTGC TCGCTTTGGT CTGTACCCTA 
29161 CTCTATATTA AATACAAAAG CAGACGCAGC TTTATTGAGG AAAAGAAAAT GCCTTAATTT 
29221 ACTAAGTTAC AAAGCTAATG TCACCACTAA CTGCTTTACT CGCTGCTTGC AAAACAAATT 
29281 CAAAAAGTTA GCATTAT/^T TAGAATAGGA TTTAAACCCC CCGGTCATTT CCTGCTCAAT 
29341 ACCATTCCCC TGAACAATTG ACTCTATGTG GGATATGCTC CAGCGCTACA ACCTTGAAGT 
29401 CAGGCTTCCT GGATGTCAGC ATCTGACTTT GGCCAGCACC TGTCCCGCGG ATTTGTTCCA 
29461 GTCCAACTAC AGCGACCCAC CCTAACAGAG ATGACCAACA CAACCAACGC GGCCGCCGCT 
29521 ACCGGACTTA CATCTACCAC AAATACACCC CAAGTTTCTG CCTTTGTCAA TAACTGGGAT 
29581 AACTTGGGCA TGTGGTGGTT CTCCATAGCG CTTATGTTTG TATGCCTTAT TATTATGTGG 
29641 CTCATCTGCT GCCTAAAGCG CAAACGCGCC CGACCACCCA TCTATAGTCC CATCATTGTG 
29701 CTACACCCAA ACAATGATGG AATCCATAGA TTGGACGGAC TGAAACACAT GTTCTTTTCT
29761 CTTACAGTAT GATTAAATGA GACATGATTC CTCGAGTTTT TATATTACTG ACCCTTGTTG 
29821 CGCTTTTTTG TGCGTGCTCC ACATTGGCTG CGGTTTCTCA CATCGAAGTA GACTGCATTC 
29881 CAGCCTTCAC AGTCTATTTG CTTTACGGAT TTGTCACCCT CACGCTCATC TGCAGCCTCA 
29941 TCACTGTGGT CATCGCCTTT ATCCAGTGCA TTGACTGGGT CTGTGTGCGC TTTGCATATC 
30001 TCAGACACCA TCCCCAGTAC AGGGACAGGA CTATAGCTGA GCTTCTTAGA ATTCTTTAAT 
30061 TATGAAATTT ACTGTGACTT TTCTGCTGAT TATTTGCACC CTATCTGCGT TTTGTTCCCC 
30121 GACCTCCAAG CCTCAAAGAC ATATATCATG CAGATTCACT CGTATATGGA ATATTCCAAG 
30181 TTGCTACAAT GAAAAAAGCG ATCTTTCCGA AGCCTGGTTA TATGCAATCA TCTCTGTTAT 
30241 GGTGTTCTGC AGTACCATCT TAGCCCTAGC TATATATCCC TACCTTGACA TTGGCTGGAA 
30301 ACGAATAGAT GCCATGAACC ACCCAACTTT CCCCGCGCCC GCTATGCTTC CACTGCAACA
30361 AGTTGTTGCC GGCGGCTTTG TCCCAGCCAA TCAGCCTCGC CCCACTTCTC CCACCCCCAC 
30421 TGAAATCAGC TACTTTAATC TAACAGGAGG AGATGACTGA CACCCTAGAT CTAGAAATGG 
30481 ACGGAATTAT TACAGAGCAG CGCCTGCTAG AAAGACGCAG GGCAGCGGCC GAGCAACAGC 
30541 GCATGAATCA AGAGCTCCAA GACATGGTTA ACTTGCACCA GTGCAAAAGG GGTATCTTTT
30601 GTCTGGTAAA GCAGGCCAAA GTCACCTACG ACAGTAATAC CACCGGACAC CGCCTTAGCT 
30661 ACAAGTTGCC AACCAAGCGT CAGAAATTGG TGGTCATGGT GGGAGAAAAG CCCATTACCA 
30721 TAACTCAGCA CTCGGTAGAA ACCGAAGGCT GCATTCACTC ACCTTGTCAA GGACCTGAGG 
30781 ATCTCTGCAC CCTTATTAAG ACCCTGTGCG GTCTCAAAGA TCTTATTCCC TTTAACTAAT 
30841 AAAAAAAAAT AATAAAGCAT CACTTACTTA AAATCAGTTA GCAAATTTCT GTCCAGTTTA 
30901 TTCAGCAGCA CCTCCTTGCC CTCCTCCCAG CTCTGGTATT GCAGCTTCCT CCTGGCTGCA 
30961 AACTTTCTCC ACT^TCTAAA TGGAATGTCA GTTTCCTCCT GTTCCTGTCC ATCCGCACCC 
31021 ACTATCTTCA TGTTGTTGCA GATGAAGCGC GCAAGACCGT CTGAAGATAC CTTCAACCCC 
31081 GTGTATCCAT ATGACACGGA AACCGGTCCT CCAACTGTGC CTTTTCTTAC TCCTCCCTTT 
31141 GTATCCCCCA ATGGGTTTCA AGAGAGTCCC CCTGGGGTAC TCTCTTTGCG CCTATCCGAA 
31201 CCTCTAGTTA CCTCCAATGG CATGCTTGCG CTCAAAATGG GCAACGGCCT CTCTCTGGAC 
31261 GAGGCCGGCA ACCTTACCTC CCAAAATGTA ACCACTGTGA GCCCACCTCT CAAAAAAACC
31321 AAGTCAAACA TAAACCTGGA AATATCTGCA CCCCTCACAG TTACCTCAGA AGCCCTAACT 
31381 GTGGCTGCCG CCGCACCTCT AATGGTCGCG GGCAACACAC TCACCATGCA ATCACAGGCC 
31441 CCGCTAACCG TGCACGACTC CAAACTTAGC ATTGCCACCC AAGGACCCCT CACAGTGTCA 
31501 GAAGGAAAGC TAGCCCTGCA AACATCAGGC CCCCTCACCA CCACCGATAG CAGTACCCTT 
31561 ACTATCACTG CCTCACCCCC TCTAACTACT GCCACTGGTA GCTTGGGCAT TGACTTGAAA 
31621 GAGCCCATTT ATACACAAAA TGGAAAACTA GGACTAAAGT ACGGGGCTCC TTTGCATGTA 
31681 ACAGACGACC TAAACACTTT GACCGTAGCA ACTGGTCCAG GTGTGACTAT TAATAATACT 
31741 TCCTTGCAAA CTAAAGTTAC TGGAGCCTTG GGTTTTGATT CACAAGGCAA TATGCAACTT 
31801 AATGTAGCAG GAGGACTAAG GATTGATTCT CAAAACAGAC GCCTTATACT TGATGTTAGT 
31861 TATCCGTTTG ATGCTCAAAA CCAACTAAAT CTAAGACTAG GACAGGGCCC TCTTTTTATA 
31921 AACTCAGCCC ACAACTTGGA TATTAACTAC AACAAAGGCC TTTACTTGTT TACAGCTTCA 
31981 AACAATTCCA AAAAGCTTGA GGTTAACCTA AGCACTGCCA AGGGGTTGAT GTTTGACGCT 
32041 ACAGCCATAG CCATTAATGC AGGAGATGGG CTTGAATTTG GTTCACCTAA TGCACCAAAC
32101 ACAAATCCCC TCAAAACAAA AATTGGCCAT GGCCTAGAAT TTGATTCAAA CAAGGCTATG 
32161 GTTCCTAAAC TAGGAACTGG CCTTAGTTTT GACAGCACAG GTGCCATTAC AGTAGGAAAC 
32221 AAAAATAATG ATAAGCTAAC TTTGTGGACC ACACCAGCTC CATCTCCTAA CTGTAGACTA 
32281 AATGCAGAGA AAGATGCTAA ACTCACTTTG GTCTTAACAA AATGTGGCAG TCAAATACTT 
32341 GCTACAGTTT CAGTTTTGGC TGTTAAAGGC AGTTTGGCTC CAATATCTGG AACAGTTCAA 
32401 AGTGCTCATC TTATTATAAG ATTTGACGAA AATGGAGTGC TACTAAACAA TTCCTTCCTG 
32461 GACCCAGAAT ATTGGAACTT TAGAAATGGA GATCTTACTG AAGGCACAGC CTATACAAAC 
32521 GCTGTTGGAT TTATGCCTAA CCTATCAGCT TATCCAAAAT CTCACGGTAA AACTGCCAAA 
32581 AGTAACATTG TCAGTCAAGT TTACTTAAAC GGAGACAAAA CTAAACCTGT AACACTAACC 
32641 ATTACACTAA ACGGTACACA GGAAACAGGA GACACAACTC CAAGTGCATA CTCTATGTCA 
32701 TTTTCATGGG ACTGGTCTGG CCACAACTAC ATTAATGAAA TATTTGCCAC ATCCTCTTAC 
32761 ACTTTTTCAT ACATTGCCCA AGAATAAAGA ATCGTTTGTG TTATGTTTCA ACGTGTTTAT 
32821 TTTTCAATTG CAGAAAATTT CAAGTCATTT TTCATTCAGT AGTATAGCCC CACCACCACA 
32881 TAGCTTATAC AGATCACCGT ACCTTAATCA AACTCACAGA ACCCTAGTAT TCAACCTGCC 
32941 ACCTCCCTCC CAACACACAG AGTACACAGT CCTTTCTCCC CGGCTGGCCT TAAAAAGCAT 
33001 CATATCATGG GTAACAGACA TATTCTTAGG TGTTATATTC CACACGGTTT CCTGTCGAGC
33061 CAAACGCTCA TCAGTGATAT TAATAAACTC CCCGGGCAGC TCACTTAAGT TCATGTCGCT 
33121 GTCCAGCTGC TGAGCCACAG GCTGCTGTCC AACTTGCGGT TGCTTAACGG GCGGCGAAGG 
33181 AGAAGTCCAC GCCTACATGG GGGTAGAGTC ATAATCGTGC ATCAGGATAG GGCGGTGGTG 
33241 CTGCAGCAGC GCGCGAATAA ACTGCTGCCG CCGCCGCTCC GTCCTGCAGG AATACAACAT 
33301 GGCAGTGGTC TCCTCAGCGA TGATTCGCAC CGCCCGCAGC ATAAGGCGCC TTGTCCTCCG 
33361 GGCACAGCAG CGCACCCTGA TCTCTACTTAA ATCAGCACAG TAACTGCAGC ACAGCACCAC 
33421 AATATTGTTC AAAATCCCAC AGTGCAAGGC GCTGTATCCA AAGCTCATGG CGGGGACCAC 
33481 AGAACCCACG TGGCCATCAT ACCACAAGCG CAGGTAGATT AAGTGGCGAC CCCTCATAAA 
33541 CACGCTGGAC ATAAACATTA CCTCTTTTGG CATGTTGTAA TTCACCACCT CCCGGTACCA
33601 TATAAACCTC TGATTAAACA TGGCGCCATC CACCACCATC CTAAACCAGC TGGCCAAAAC 
33661 CTGCCCGCCG GCTATACACT GCAGGGAACC GGGACTGGAA CAATGACAGT GGAGAGCCCA 
33721 GGACTCGTAA CCATGGATCA TCATGCTCGT CATGATATCA ATGTTGGCAC AACACAGGCA 
33781 CACGTGCATA CACTTCCTCA GGATTACAAG CTCCTCCCGC GTTAGAACCA TATCCCAGGG 
33841 AACAACCCAT TCCTGAATCA GCGTAAATCC CACACTGCAG GGAAGACCTC GCACGTAACT
33901 CACGTTGTGC ATTGTCAAAG TGTTACATTC GGGCAGCAGC GGATGATCCT CCAGTATGGT 
33961 AGCGCGGGTT TCTGTCTCAA AAGGAGGTAG ACGATCCCTA CTGTACGGAG TGCGCCGAGA 
34021 CAACCGAGAT CGTGTTGGTC GTAGTGTCAT GCCAAATGGA ACGCCGGACG TAGTCATATT 
34081 TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT GCGTCTCCGG TCTCGCCGCT 
34141 TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT CAAAGCATCC AGGCGCCCCC 
34201 TGGCTTCGGG TTCTATGTAA ACTCCTTCAT GCGCCGCTGC CCTGATAACA TCCACCACCG 
34261 CAGAATAAGC CACACCCAGC CAACCTACAC ATTCGTTCTG CGAGTCACAC ACGGGAGGAG 
34321 CGGGAAGAGC TGGAAGAACC ATGTTTTTTT TTTTATTCCA AAAGATTATC CAAAACCTCA 
34381 AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG CGTGGTCAAA CTCTACAGCC 
34441 AAAGAACAGA TAATGGCATT TGTAAGATGT TGCACAATGG CTTCCAAAAG GCAAACGGCC
34501 CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT GAATCTCCTC TATAAACATT 
34561 CCAGGACCTT CAACCATGCC CAAATAATTC TCATCTCGCC ACCTTCTCAA TATATCTCTA 
34621 AGCAAATCCC GAATATTAAG TCCGGCCATT GTAAAAATCT GCTCCAGAGC GCCCTCCAGC
34681 TTCAGCCTCA AGCAGCGAAT CATGATTGCA AAAATTCAGG TTCCTCACAG ACCTGTATAA
34741 GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG TAGGTCCCTT CGCAGGGCCA 
34801 GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC CACTTCCCCG CCAGGAACCT
34861 TGACAAAAGA ACCCACACTG ATTATGACAC GCATACTCGG AGCTATGCTA ACCAGCGTAG 
34921 CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT GCAAGGTGCT GCTCAAAAAA 
34981 TCAGGCAAAG CCTCGCGCAA AAAAGAAAGC ACATCGTAGT CATGCTCATG CAGATAAAGG 
35041 CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT TTCTCTCAAA CATGTCTGCG 
35101 GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT TTAAACATTA GAAGCCTGTC
35161 TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC TACGGCCATG CGGGCGTGAC 
35221 CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA CAGCTCCTCG GTCATGTCCG 
35281 GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT CATCGGTCAG TGCTAAAAAG 
35341 CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGGGTA GAGACAACAT TACAGCCCCC 
35401 ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT AAACACCTGA AAAACCCTCC 
35461 TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT ACAGCGCTTC ACAGCGGCAG
35521 CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA AAAAAACACC ACTCGACACG 
35581 GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT GCAGAGCGAG TATATATAGG
35641 ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC CCAGAAAACC GCACGCGAAC 
35701 CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT CAAATCGTCA CTTCCGTTTT 
35761 CCCACGTTAC GTAACTTCCC ATTTTAAGAA AACTACAATT CCCAACACAT ACAAGTTACT 
35821 CCGCCCTAAA ACCTACGTCA CCCGCCCCGT TCCCACGCCC CGCGCCACGT CACAAACTCC 
35881 ACCCCCTCAT TATCATATTG GCTTCAATCC AAAATAAGGT ATATTATTGA TGATG 
Western blot on whole-cell extracts from 293 cells transfected with plasmid DNA expressing the
different HCV NS cassettes. Mature NS3 and NS5A products were detected with specific antibodies.
IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 25 µg DNA/dose with
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on splenocytes from BalbC mice immunized with two injections of 50 µg DNA/dose with
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
Western blot on whole-cell extracts from HeLa cells infected at different multiplicities of infection
with Adenovectors expressing the different HCV NS cassettes. Mature
NS5B and NS5A products were detected with specific antibodies.
IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 10^ vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10'* vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of Ad5-NS.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10' > vp/dose of MRKAd5-NSmut.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd6-NSmut.
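
The ELISPOT captions above report responses as spot-forming cells (SFC) per 10^6 cells. A minimal sketch of that normalization is given below. It assumes hypothetical raw spot counts per well and a known number of cells plated per well; neither value appears in the source. Subtracting the spots seen in unstimulated control wells is a common convention and is likewise an assumption here, not a statement of the method used for these figures.

```python
def sfc_per_million(spots, cells_per_well, background_spots=0.0):
    """Convert a raw spot count from one well to SFC per 10^6 cells.

    spots            -- spots counted in the antigen-stimulated well (hypothetical input)
    cells_per_well   -- number of cells plated in that well (hypothetical input)
    background_spots -- spots in the matching unstimulated well (assumed convention)
    """
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    # Negative net counts are clamped to zero rather than reported as negative responses.
    net_spots = max(spots - background_spots, 0.0)
    return net_spots * 1_000_000 / cells_per_well


def mean_sfc_per_million(replicate_spots, cells_per_well, background_spots=0.0):
    """Average replicate wells first, then normalize the mean to 10^6 cells."""
    if not replicate_spots:
        raise ValueError("at least one replicate well is required")
    mean_spots = sum(replicate_spots) / len(replicate_spots)
    return sfc_per_million(mean_spots, cells_per_well, background_spots)
```

For example, 50 spots in a well plated with 250,000 cells corresponds to 200 SFC per 10^6 cells under these assumptions.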
1 GCCACCATGG CCCCCATGAC CGCCTACAGC CAGJCAGACCA GGGGCCTGCT
51 GGGCTGCATC ATCACCAGCC TGACCGGACG CGACAAGAAC CAGGTGGAGG
101 GAGAGGTGCA GGTGGTGAGG ACCGCTACGC AGAGCTTCCT GGCCACCTGC
151 GTGAACGGGG TCiXXItGGAC CGTGTAGCAC GGAGCCGGAA GCAAGACCCT
201 GGCCGGAGGC AAGGGCCCTA TCACGCAGAT GTAGACGAAT GTGGATGAGG
251 ATCTGGTGGG GTGGGAGGGC GGTCCCGGAG CCAGGAGCGT GAGAGCCTGT
301 ACCTGTGGAA GCAGGGAGGT GTACCTGGTG ACAGGGCACG CGGATGTGAT
351 CCCCGTGAGG CGCAGGGGGG ATTCTCGGGG AAGGCTGGTG AGGGCTAGGC
401 CGGTGAGCTA CCTGAAGGGC AGCAGGGGAG GACGCCTGGT GTGTCGTTCT
451 GGCCATCCCG TGGGCAT^^ TCGCGGTGCd GTGTGTAGGA GGGGCGTGGC
501 GAAAGCCGTG GATTTfGTGC GGGTGGAAAG CATGG&GAGC AGGATGGGCA
551 GdCCTGTGTT CAGGGAGAAC AGGi^TTCGCC GTGCCGTGeC CGAATCATTC
601 GAGGTGGCTG ACCTGGACGC GCGTACGGGA TCTGGCAAGA GCAGCAAGGT
651 GGCGGCTGGC TAGGCGGCTC AGGGCTAGAA GGTGCfGGTG CTGAACCCCA
701 GGGTGGCCGC TAGCCTGGGC TTCGGCGCTT ACATGAGCAA GGCCCATGGC
751 ATCGACCCCA ACATGGGCAG AGGCGTGCGC AGCATCACCA CCGGAGCTCC
801 CGTGAGCTAG AGCACCTACG GGAAGTTCCT GGCCGATGGA GGCTGCAGGG
851 GAGGAGCCTA CGACATCATC ATCTGGGACG AGTGCGACAG GAGCGACAGG
901 AGCAGCATCC TGGGCATTGG CACCGTGCTG GATCAGGCCG AAACAGCTGG
951 AGCCAGGCTG GTGGTGCTGG GCACAGCTAC CCCTCCTGGC AGCGTGACCG
1001 TGCCCCATCC CAATATCGAG GAGGTGGCCC TGAGCAAGAC AGGCGAGATC
1051 CCGTTCTACG GCAAGGCCAT GGCCATCGAG GCCATGCGCG GAGGCAGGCA
1101 CGTGATCTTG TGGCAGAGGA AGAAGAAGTG CGAGGAGCTG GGTGCGAAGC
1151 TGAGCGGACT GGGCATCAAC GCCGTGGCGT ACTAGAGGGG CCTGGAGGTG
1201 TGAGTGATCC CCAGCATGGG GGATGTGGTG GTGGTGGCGA GGGACGGCGT
1251 GATGACAGGC TACACGGGAG AGTTCGAGAG CGTGATGGAG TGCAACACCT
1301 GCGTGACCCA GAGGGTGGAG TTGAGGCTGG AGCGCAGGTT CACGATGGAA
1351 AGCAGCACGG TGGCTCAGGA TGGTGTGAGG AGGAGCGAGA GGCGCGGACG
1401 CAGGGGAAGG GGCAGGCGCG GAATTTATGG GTTTGTGACC CCTGGCGAAA
1451 GGGGGTGTGG GATGTTCGAG AGCAGCGTGC TGTGCGAGTG CTAGGACGGTP
1501 GGCTGGGGTT GGTACGAGGT GACAGGCGGT GAAAGGAGCG TGCGCCTGCG
1551 GGCTTATCTG AATACGGCTG GCGTGGCGGT GTGTCAGGAG GACGTGGAGT
1601 TCTGGGAGAG CGTGTTCACA GGACTGACCC ACATCGACGC CCATTTCCTG
1651 AGCCAGACCA AGCAGGCTGG CGACAACTTC CCCTATCTGG TGGCCTATCA
1701 GGCCACCGTG TGTGCTAGGG CCCAAGCTCC ACCTCCTTCA TGGGACCAGA
1751 TGTGGAAGTG CCTGATCCGC CTGAAGCCCA CCCTGCACGG CCCTACCCCT
1801 CTGCTGTACC GCCTGGGAGC CGTGCAGAAC GAGGTGACCC TGACCCACCC
1851 CATCACCAAG TACATCATGG CCTGCATGAG CGCTGATCTG GAAGTGGTGA
1901 CCAGCACCTG GGTGCTGGTG GGAGGCGTGC TGGCCGCTCT GGCTGCCTAC
1951 TGCCTGACCA CCGGAAGCGT GGTGATCGTG GGACGCATCA TCCTGAGCGG
2001 AAGGCCCGCT ATCGTGCCCG ATCGCGAGTT CCTGTACCAG GAGTTCGACG
2051 AGATGGAGGA GTGTGCCAGC CACCTGCCCT ACATCGAGCA GGGCATGCAG
2101 CTGGCCGAAC AGTTCAAGCA GAAGGCCCTG GGCGTGCTGC AGACAGCCAC
2151 CAAACAGGCC GAAGCTGCCG CTCCCGTGGT GGA?^GCAAG TGGAGGGCCC
2201 TGGAGACCTT CTGGGCTAAG CACATGTGGA ACTTCATCTC TGGCATCCAG
2251 TACCTGGCCG GACTGAGCAC CCTGCCTGGC AACCCCGCTA TCGCCAGCCT
2301 GATGGCCTTC ACCGCTAGCA TCACCTCTCC CCTGACCACC CAGAGCACCC
2351 TGCTGTTCAA CATTCTGGGC GGATGGGTGG CCGCTCAGCT GGCCCCTCCT
2401 TCAGCTGCTT CTGCCTTTGT GGGCGCTGGC ATTGCCGGAG CCGCTGTGGG
2451 CAGCATTGGC CTGGGCAAAG TGCTGGTGGA TATTCTGGCT GGCTATGGCG
2501 CTGGCGTGGC CGGAGCCCTG GTGGCCTTCA AGGTGATGAG CGGAGAGATG
2551 CCCAGCACCG AGGACCTGGT GAACCTGCTG CCTGCCATTC TGAGCCCTGG
2601 AGCCCTGGTG GTGGGCGTGG TGTGTGCTGC CATTCTGAGG CGCCATGTGG
2651 GACCCGGAGA GGGCGCTGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGCC
2701 TCTCGCGGAA ACCACGTGAG CCCTACCCAC TACGTGCCTG AGAGCGACGC
2751 CGCTGCCAGG GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTGC
2801 TGAAGCGCCT GCACCAGTGG ATCAACGAGG ACTGCAGCAC ACCCTGCAGC
2851 GGAAGCTGGC TGAGGGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA
2901 CTTCAAGACC TGGCTGCAGA GCAAGCTGCT GCCCCAACTG CCTGGCGTGC
2951 CCTTCTTCTC ATGCCAGCGC GGATACAAGG GCGTGTGGAG GGGCGATGGC
3001 ATCATGCAGA CCACCTGTCC CTGCGGAGCC CAGATCACAG GCCACGTGAA
3051 GAACGGCAGC ATGCGCATCG TGGGCCCTAA GACCTGCAGC AACACCTGGC
3101 ACGGCACCTT CCCCATCAAC GCCTACACCA CCGGACCCTG CACACCCAGC
3151 CCTGCTCCCA ACTACAGCAG GGCCCTGTGG AGGGTGGCTG CCGAGGAGTA

FIG. 20B 



WO 03/031588

PCT/US02/32512

91/92



3201 CGTGGAGGTG ACCAGCsGTCG GAGACTTCCA CTACGTGACC GGAATGACCA
3251 CGGACAACGT GAAGTGTCCC TGTCAGGTGC CCGCTCCCGA ATTTTTTACC
3301 GAAGTGGATG GCGTGCGGCT GGATCGCTAT GCCCCTGGCT GTAGGCCCCT
3351 GCTGGGGGAA GAAGTGACCT TCCAGGTGGG CCTGAACCAG TACCTGGTGG
3401 GCAGCCAGGT GCGCTGGGAG CCTGAGCGCG ATGTGGCCGT GGTGACCAGC
3451 ATGCTGACCG ACCGCAGCCA CATCAGAGCC GAAACCGCTA AAAGGCGCCT
3501 GGCCAGGGGC TCTCCTCCAA GCCTGGCCTC AAGCAGCGCT AGGCAGCTGT
3551 CTGCTCCGAe CCTGAAGGGC AGCTGGAGGA CCGACCACGT GAGCCCCGAC
3601 GCCGAGGTCSA TCGAGGCCAA CCTGCTGTGG CGCCAGGAGA TGGGCGGGAA
3651 CATCAGGGGG GTGGAGAGCG AGAACAAGGT GGTGGTGGTG GAGAGCTTCG
3701 ACCCCGTGCG GGCCGAGiSAG GAGGAGCGCG AGGTGAGCGT GCGCGCGGAG
3751 ATCCTGGGCA AGAGCAAGAA GTTGCGCGCT GCCATGCGGA TCTGGGGTAG
3801 ACCTGATTAC AAGCCTCCCG TGGTGGAGAG GTGGAAGGAC GCTGATTAGG
3851 TGCCTCCAGT GGTGCATGGG TGTCCTGTGC CTCCCATTAA AGCCCCTCCT
3901 ATTCCACCTC CTAGGGGCAA AAGGAGGGTG GTGCTGACAG AAAGCAGCGT
3951 GAGCTCTGCT GTGGCCGAAC TGGGGAGCAA GACCTTTGGC AGCAGCGAGA
4001 GGTCTGCCGT GGAGAGCGGA AGAGGCACCG GTCTGCCTGA CCAGGCGAGC
4051 GAGGACGGGG ATAAGGGCAG CGATGTGGAG AGGTATAGCA GCATGCGTCC
4101 CCTGGAAGGG GAACCTGGCG ATGGGGATCT GAGCGATGGG AGCTGGAGCA
4151 GCGTGAGCGA AGAGGCCAGC GAGGACGTGG TGTGTTGCAG CATGAGGTAC
4201 AGCTGGACAG GCGGTCTGAT CACAGGGTGC GCTGCCGAGG AGAGCAAGCT
4251 GGCCATCAAG GCCCTGAGCA ACAGGCTGCT GAGGCACCAC AACATGGTGT
4301 AGGCCACCAC GAGCAGGTCT GGCGGAGTGA GGGAGAAGAA GGTGACGTTC
4351 GACCGCGTGC AGGTGCTGGA CGACCAGTAG GGCGATGTGC TGAAGGAGAT
4401 GAAGGCCAAG GCCAGCACCG TGAAGGCCAA GCTGCTGAGC GTGGAGGAGG
4451 CGTGCAAGCT GACCCCCCCC CACAGCGCCA AGAGCAAGTO CGGCTACGGC
4501 GCCAAGGACG TGCGCAACCT GAGCAGCAAG GCCGTGAACC ACATCCACAG
4551 CGTGTGGAAG GACCTGCTGG AGGACACCGT GACCCCCATC GACACCACCA
4601 TCATGGCCAA GAACGAGGTG TTCTGCGTGC AGCCCGAGAA GGGCGGCCGC
4651 AAGCCCGCTC GCCTGATCGT GTTCCCCGAT CTGGGCGTGC GCGTGTGCGA
4701 GAAGATGGCC CTGTACGACG TGGTGAGCAC CCTGCCTCAG GTGGTGATGG
4751 GCTCAAGCTA CGGCTTCCAG TACAGCCCTG GCCAGCGCGT GGAGTTCCTG
4801 GTGAACACCT GGAAGAGCAA GAAGAACCCC ATGGGCTTCA GCTACGACAC
4851 ACGCTGCTTC GACAGCACCG TGACCGAGAA CGACATCCGC GTGGAGGAGA 
4901 GCATCTACCA GTGCTGCGAC CTGGCCCCTG AGGCCAGGCA GGCCATCAAG 
4951 AGCCTGACCG AGCGCCTGTA CATCGGAGGC CCTCTGACCA ACAGCAAGGG 
5001 ACAGAACTGC GGATACAGGC GCTGTAGGGC CTCTGGCGTG CTGACCACCA 
5051 GCTGTGGCAA CACCCTGACC TGCTACCTGA AGGCCAGCGC TGCCTGTCGC 
5101 GCTGCCAAGC TGCAGGACTG CACCATGCTG GTGAACGCCG CTGGCCTGGT
5151 GGTGATTTGT GAAAGCGCTG GCACCCAGGA AGATGCTGCC AGCCTGCGCG 
5201 TGTTCACCGA GGCCATGACC AGGTACTCTG CCCCTCCCGG AGACCCCCCT 
5251 CAGCCCGAAT ACGACCTGGA GCTGATCACC AGCTGCTCAA GCAACGTGAG 
5301 CGTGGCTCAC GACGCCAGCG GAAAGCGCGT GTACTACCTG ACACGCGATC
5351 CCACCACCCC TCTGGCTCGC GCTGCCTGGG AAACCGCTCG CCATACACCC 
5401 GTGAACAGCT GGCTGGGCAA CATCATCATG TACGCCCCTA CCCTGTGGGC
5451 TCGCATGATC CTGATGACCC ACTTCTTCAG CATCCTGCTG GCTCAGGAGC 
5501 AGCTGGAGAA GGCCCTGGAC TGCCAGATTT ACGGCGCTTG CTACAGCATC
5551 GAGCCCCTGG ACCTGCCCCA AATCATCGAG CGCCTGCACG GCCTGTCTGC 
5601 CTTCAGCCTG CACAGCTACA GCCCTGGCGA AATTAATCGC GTGGCCAGCT 
5651 GTCTGCGCAA ACTGGGCGTG CCTCCTCTGC GCGTGTGGAG GCATAGGGCT 
5701 AGGAGCGTGA GGGCTAGGCT GCTGAGCCAG GGAGGCAGGG CCGCTACCTG 
5751 TGGAAAGTAC CTGTTCAACT GGGCCGTGAA GACCAAGCTG AAGCTGACCC 
5801 CTATCCCTGC CGCTAGCCAG CTGGACCTGA GCGGATGGTT CGTGGCTGGC 
5851 TACAGCGGAG GCGACATCTA CCACAGCCTG TCTCGCGCTC GCCCTCGCTG 
5901 GTTCATGCTG TGCCTGCTGC TGCTGAGCGT GGGCGTGGGC ATCTACCTGC 
5951 TGCCCAACCG CTAAA 
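
The nucleotide numbering of the coding sequence above can be related to the residue numbering of the encoded polypeptide. The sketch below is illustrative only: it assumes the standard genetic code and assumes the ATG start codon occupies nucleotides 7 to 9 (directly after the leading GCCACC). The function names are hypothetical. It translates the first 48 nucleotides of the row that begins at position 4951.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string (whitespace ignored) to one-letter residues."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def residue_number(nt_position, start_codon_position=7):
    """Residue number of the codon starting at nt_position (1-based numbering).

    start_codon_position=7 is an assumption: ATG directly after 'GCCACC'.
    """
    offset = nt_position - start_codon_position
    if offset < 0 or offset % 3 != 0:
        raise ValueError("position is not the first base of a codon")
    return offset // 3 + 1


# Row starting at nucleotide 4951, as printed in the figure.
ROW_4951 = "AGCCTGACCG AGCGCCTGTA CATCGGAGGC CCTCTGACCA ACAGCAAGGG"
fragment = "".join(ROW_4951.split())[:48]  # 16 complete codons
peptide = translate(fragment)
first_residue = residue_number(4951)
```

Under these assumptions `peptide` is `SLTERLYIGGPLTNSK` and `first_residue` is 1649, so this row would encode residues 1649 to 1664. The ending Gly-Pro-Leu-Thr-Asn-Ser-Lys agrees with the legible part of the corresponding row in the sequence listing.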
OF THE UNITED STATES PATENT AND TRADEMARK OFFICE
Assistant Commissioner of Patents
BOX PCT
I hereby state that the content of the paper and computer readable forms of the
Sequence Listing, submitted in accordance with WIPO Standard ST.23 and under PCT
Rule 13ter.1, respectively, are the same.
SEQUENCE LISTING

<110> Istituto Di Ricerche Di Biologia Molecolare P. Angeletti S.p.A.

<120> HEPATITIS C VIRUS VACCINE
<130> ITROOISY

<150> 60/363,774
<151> 2002-03-13

<150> 60/328,655
<151> 2001-10-11

<160> 17

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 1985
<212> PRT
<213> Artificial Sequence

<220>
<223> Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide



<400> 1

Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
20 25 30

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
35 40 45

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
50 55 60

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65 70 75 80

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
85 90 95

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
100 105 110

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
115 120 125

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
130 135 140

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
165 170 175

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
180 185 190

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
195 200 205
Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
210 215 220

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
225 230 235 240

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
245 250 255

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
260 265 270

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
290 295 300

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
355 360 365

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
370 375 380

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
450 455 460

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
465 470 475 480

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
500 505 510

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
545 550 555 560

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
610 615 620

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
625 630 635 640
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
645 650 655

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
660 665 670

Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
675 680 685

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
690 695 700

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
705 710 715 720

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
725 730 735

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
740 745 750

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
755 760 765

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
770 775 780

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
785 790 795 800

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
805 810 815

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
820 825 830

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
835 840 845

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro
850 855 860

Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
865 870 875 880

Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala
885 890 895

Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu
900 905 910

Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile
915 920 925

Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser
930 935 940

Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys
945 950 955 960

Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro
965 970 975

Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly
980 985 990

Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala
995 1000 1005

Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro
1010 1015 1020

Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr
1025 1030 1035 1040

Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala
1045 1050 1055

Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly
1060 1065 1070
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro
1075 1080 1085

Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg
1090 1095 1100

Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val
1105 1110 1115 1120

Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro
1125 1130 1135

Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp
1140 1145 1150

Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly
1155 1160 1165

Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
1170 1175 1180

Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp
1185 1190 1195 1200

Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
1205 1210 1215

Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp
1220 1225 1230

Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu
1235 1240 1245

Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala
1250 1255 1260

Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp
1265 1270 1275 1280

Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala
1285 1290 1295

Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu
1300 1305 1310

Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly
1315 1320 1325

Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro
1330 1335 1340

Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr
1345 1350 1355 1360

Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser
1365 1370 1375

Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val
1380 1385 1390

Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys
1395 1400 1405

Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu
1410 1415 1420

Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly
1425 1430 1435 1440

Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
1445 1450 1455

His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1460 1465 1470

Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro
1475 1480 1485

His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn
1490 1495 1500
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu
1505 1510 1515 1520

Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1525 1530 1535

Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
1540 1545 1550

Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala
1555 1560 1565

Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser
1570 1575 1580

Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
1585 1590 1595 1600

Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg
1605 1610 1615

Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser
1620 1625 1630

Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys
1635 1640 1645

Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys
1650 1655 1660

Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr
1665 1670 1675 1680

Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala
1685 1690 1695

Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Ala Ala
1700 1705 1710

Gly Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala
1715 1720 1725

Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro
1730 1735 1740

Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys
1745 1750 1755 1760

Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr
1765 1770 1775

Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu
1780 1785 1790

Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met
1795 1800 1805

Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe
1810 1815 1820

Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln
1825 1830 1835 1840

Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile
1845 1850 1855

Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser
1860 1865 1870

Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val
1875 1880 1885

Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1890 1895 1900

Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe
1905 1910 1915 1920

Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala
1925 1930 1935
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly
1940 1945 1950

Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu
1955 1960 1965

Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn
1970 1975 1980

Arg
1985

<210> 2
<211> 5965
<212> DNA
<213> Artificial Sequence

<220>
<223> Non-optimized cDNA sequence encoding SEQ. ID. NO. 1

<400> 2

gccaccatgg cgcccatcac ggcctactcc caacagacgc ggggcctact tggttgcatc 60 

atcactagcc ttacaggccg ggacaagaac caggtcgagg gagaggttca ggtggtttcc 120 

accgcaacac aatccttcct ggcgacctgc gtcaacggcg tgtgttggac cgtttaccat 180 

ggtgctggct caaagacctt agccggccca aaggggccaa tcacccagat gtacactaat 240 

gtggaccagg acctcgtcgg ctggcaggcg ccccccgggg cgcgttcctt gacaccatgc 300 

acctgtggca gctcagacct ttacttggtc acgagacatg ctgacgtcat tccggtgcgc 360 

cggcggggcg acagtagggg gagcctgctc tcccccaggc ctgtctccta cttgaagggc 420 

tcttcgggtg gtccactgct ctgcccttcg gggcacgctg tgggcatctt ccgggctgcc 480 

gtatgcaccc ggggggttgc gaaggcggtg gactttgtgc ccgtagagtc catggaaact 540 

actatgcggt ctccggtctt cacggacaac tcatcccccc cggccgtacc gcagtcattt 600 

caagtggccc acctacacgc tcccactggc agcggcaaga gtactaaagt gccggctgca 660 

tatgcagccc aagggtacaa ggtgctcgtc ctcaatccgt ccgttgccgc taccttaggg 720 

tttggggcgt atatgtctaa ggcacacggt attgacccca acatcagaac tggggtaagg 780 

accattacca caggcgcccc cgtcacatac tctacctatg gcaagtttct tgccgatggt 840 

ggttgctctg ggggcgctta tgacatcata atatgtgatg agtgccattc aactgactcg 900 

actacaatct tgggcatcgg cacagtcctg gaccaagcgg agacggctgg agcgcggctt 960 

gtcgtgctcg ccaccgctac gcctccggga tcggtcaccg tgccacaccc aaacatcgag 1020 

gaggtggccc tgtctaatac tggagagatc cccttctatg gcaaagccat ccccattgaa 1080 

gccatcaggg ggggaaggca tctcattttc tgtcattcca agaagaagtg cgacgagctc 1140 

gccgcaaagc tgtcaggcct cggaatcaac gctgtggcgt attaccgggg gctcgatgtg 1200 

tccgtcatac caactatcgg agacgtcgtt gtcgtggcaa cagacgctct gatgacgggc 1260 

tatacgggcg actttgactc agtgatcgac tgtaacacat gtgtcaccca gacagtcgac 1320 

ttcagcttgg atcccacctt caccattgag acgacgaccg tgcctcaaga cgcagtgtcg 1380 

cgctcgcagc ggcggggtag gactggcagg ggtaggagag gcatctacag gtttgtgact 1440 

ccgggagaac ggccctcggg catgttcgat tcctcggtcc tgtgtgagtg ctatgacgcg 1500 

ggctgtgctt ggtacgagct cacccccgcc gagacctcgg ttaggttgcg ggcctacctg 1560 

aacacaccag ggttgcccgt ttgccaggac cacctggagt tctgggagag tgtcttcaca 1620 

ggcctcaccc acatagatgc acacttcttg tcccagacca agcaggcagg agacaacttc 1680 

ccctacctgg tagcatacca agccacggtg tgcgccaggg ctcaggcccc acctccatca 1740 

tgggatcaaa tgtggaagtg tctcatacgg ctgaaaccta cgctgcacgg gccaacaccc 1800 

ttgctgtaca ggctgggagc cgtccaaaat gaggtcaccc tcacccaccc cataaccaaa 1860 

tacatcatgg catgcatgtc ggctgacctg gaggtcgtca ctagcacctg ggtgctggtg 1920 

ggcggagtcc ttgcagctct ggccgcgtat tgcctgacaa caggcagtgt ggtcattgtg 1980 

ggtaggatta tcttgtccgg gaggccggct attgttcccg acagggagtt tctctaccag 2040 

gagttcgatg aaatggaaga gtgcgcctcg cacctccctt acatcgagca gggaatgcag 2100 

ctcgccgagc aattcaagca gaaagcgctc gggttactgc aaacagccac caaacaagcg 2160 
gaggctgctg ctcccgtggt ggagtccaag tggcgagccc ttgagacatt ctgggcgaag 2220 
cacatgtgga atttcatcag cgggatacag tacttagcag gcttatccac tctgcctggg 2280 
aaccccgcaa tagcatcatt gatggcattc acagcctcta tcaccagccc gctcaccacc 2340 
caaagtaccc tcctgtttaa catcttgggg gggtgggtgg ctgcccaact cgcccccccc 2400 
agcgccgctt cggctttcgt gggcgccggc atcgccggtg cggctgttgg cagcataggc 2460 
cttgggaagg tgcttgtgga cattctggcg ggttatggag caggagtggc cggcgcgctc 2520 
gtggccttca aggtcatgag cggcgagatg ccctccaccg aggacctggt caatctactt 2580 
cctgccatcc tctctcctgg cgccctggtc gtcggggtcg tgtgtgcagc aatactgcgt 2640 
cgacacgtgg gtccgggaga gggggctgtg cagtggatga accggctgat agcgttcgcc 2700 
tcgcggggta atcatgtttc ccccacgcac tatgtgcctg agagcgacgc cgcagcgcgt 2760 
gttactcaga tcctctccag ccttaccatc actcagctgc tgaaaaggct ccaccagtgg 2820 
attaatgaag actgctccac accgtgttcc ggctcgtggc taagggatgt ttgggactgg 2880 
atatgcacgg tgttgactga cttcaagacc tggctccagt ccaagctcct gccgcagcta 2940 
ccgggagtcc cttttttctc gtgccaacgc gggtacaagg gagtctggcg gggagacggc 3000 
atcatgcaaa ccacctgccc atgtggagca cagatcaccg gacatgtcaa aaacggttcc 3060 
atgaggatcg tcgggcctaa gacctgcagc aacacgtggc atggaacatt ccccatcaac 3120 
gcatacacca cgggcccctg cacaccctct ccagcgccaa actattctag ggcgctgtgg 3180 
cgggtggccg ctgaggagta cgtggaggtc acgcgggtgg gggatttcca ctacgtgacg 3240 
ggcatgacca ctgacaacgt aaagtgccca tgccaggttc cggctcctga attcttcacg 3300 
gaggtggacg gagtgcggtt gcacaggtac gctccggcgt gcaggcctct cctacgggag 3360 
gaggttacat tccaggtcgg gctcaaccaa tacctggttg ggtcacagct accatgcgag 3420 
cccgaaccgg atgtagcagt gctcacttcc atgctcaccg acccctccca catcacagca 3480 
gaaacggcta agcgtaggtt ggccaggggg tctcccccct ccttggccag ctcttcagct 3540 
agccagttgt ctgcgccttc cttgaaggcg acatgcacta cccaccatgt ctctccggac 3600 
gctgacctca tcgaggccaa cctcctgtgg cggcaggaga tgggcgggaa catcacccgc 3660 
gtggagtcgg agaacaaggt ggtagtcctg gactctttcg acccgcttcg agcggaggag 3720 
gatgagaggg aagtatccgt tccggcggag atcctgcgga aatccaagaa gttccccgca 3780 
gcgatgccca tctgggcgcg cccggattac aaccctccac tgttagagtc ctggaaggac 3840 
ccggactacg tccctccggt ggtgcacggg tgcccgttgc cacctatcaa ggcccctcca 3900 
ataccacctc cacggagaaa gaggacggtt gtcctaacag agtcctccgt gtcttctgcc 3960 
ttagcggagc tcgctactaa gaccttcggc agctccgaat catcggccgt cgacagcggc 4020 
acggcgaccg cccttcctga ccaggcctcc gacgacggtg acaaaggatc cgacgttgag 4080 
tcgtactcct ccatgccccc ccttgagggg gaaccggggg accccgatct cagtgacggg 4140 
tcttggtcta ccgtgagcga ggaagctagt gaggatgtcg tctgctgctc aatgtcctac 4200 
acatggacag gcgccttgat cacgccatgc gctgcggagg aaagcaagct gcccatcaac 4260 
gcgttgagca actctttgct gcgccaccat aacatggttt atgccacaac atctcgcagc 4320 
gcaggcctgc ggcagaagaa ggtcaccttt gacagactgc aagtcctgga cgaccactac 4380 
cgggacgtgc tcaaggagat gaaggcgaag gcgtccacag ttaaggctaa actcctatcc 4440 
gtagaggaag cctgcaagct gacgccccca cattcggcca aatccaagtt tggctatggg 4500 
gcaaaggacg tccggaacct atccagcaag gccgttaacc acatccactc cgtgtggaag 4560 
gacttgctgg aagacactgt gacaccaatt gacaccacca tcatggcaaa aaatgaggtt 4620 
ttctgtgtcc aaccagagaa aggaggccgt aagccagccc gccttatcgt attcccagat 4680 
ctgggagtcc gtgtatgcga gaagatggcc ctctatgatg tggtctccac ccttcctcag 4740 
gtcgtgatgg gctcctcata cggattccag tactctcctg ggcagcgagt cgagttcctg 4800 
gtgaatacct ggaaatcaaa gaaaaacccc atgggctttt catatgacac tcgctgtttc 4860 
gactcaacgg tcaccgagaa cgacatccgt gttgaggagt caatttacca atgttgtgac 4920 
ttggcccccg aagccagaca ggccataaaa tcgctcacag agcggcttta tatcgggggt 4980 
cctctgacta attcaaaagg gcagaactgc ggttatcgcc ggtgccgcgc gagcggcgtg 5040 
ctgacgacta gctgcggtaa caccctcaca tgttacttga aggcctctgc agcctgtcga 5100 
gctgcgaagc tccaggactg cacgatgctc gtgaacgccg ccggccttgt cgttatctgt 5160 
gaaagcgcgg gaacccaaga ggacgcggcg agcctacgag tcttcacgga ggctatgact 5220 
aggtactctg ccccccccgg ggacccgccc caaccagaat acgacttgga gctgataaca 5280 
tcatgttcct ccaatgtgtc ggtcgcccac gatgcatcag gcaaaagggt gtactacctc 5340 
acccgtgatc ccaccacccc cctcgcacgg gctgcgtggg aaacagctag acacactcca 5400 
gttaactcct ggctaggcaa cattatcatg tatgcgccca ctttgtgggc aaggatgatt 5460 
ctgatgactc acttcttctc catccttcta gcacaggagc aacttgaaaa agccctggac 5520 

tgccagatct acggggcctg ttactccatt gagccacttg acctacctca gatcattgaa 5580 

cgactccatg gccttagcgc attttcactc catagttact ctccaggtga gatcaatagg 5640 

gtggcttcat gcctcaggaa acttggggta ccacccttgc gagtctggag acatcgggcc 5700 
aggagcgtcc gcgctaggct actgtcccag ggggggaggg ccgccacttg tggcaagtac 5760 

ctcttcaact gggcagtgaa gaccaaactc aaactcactc caatcccggc tgcgtcccag 5820 

ctggacttgt ccggctggtt cgttgctggt tacagcgggg gagacatata tcacagcctg 5880 

tctcgtgccc gaccccgctg gttcatgctg tgcctactcc tactttctgt aggggtaggc 5940 

atctacctgc tccccaaccg ataaa 5965 

<210> 3 

<211> 5965 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Optimized cDNA encoding SEQ ID NO: 1 

<400> 3 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60 

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120 

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240 

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300 

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360 

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420 

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480 

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540 

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600 

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660 

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720 

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780 

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840 

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960 

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020 

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200 

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260 

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380 

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440 

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500 

ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560 

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620 

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680 

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 
gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220 
cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 
aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 
cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 
agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 
ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 
gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 
cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 
cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 
agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 
gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 
atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 
atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 
cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 
atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 
atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 
gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 
cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 
ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300 
gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 
gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 
cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 
gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 
agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 
gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 
gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 
gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgcc 3780 
gccatgccca tctgggcccg ccccgactac aacccccccc tgctggagag ctggaaggac 3840 
cccgactacg tgccccccgt ggtgcacggc tgccccctgc cccccatcaa ggcccccccc 3900 
atcccccccc cccgccgcaa gcgcaccgtg gtgctgaccg agagcagcgt gagcagcgcc 3960 
ctggccgagc tggccaccaa gaccttcggc agcagcgaga gcagcgccgt ggacagcggc 4020 
accgccaccg ccctgcccga ccaggccagc gacgacggcg acaagggcag cgacgtggag 4080 
agctacagca gcatgccccc cctggagggc gagcccggcg accccgacct gagcgacggc 4140 
agctggagca ccgtgagcga ggaggccagc gaggacgtgg tgtgctgcag catgagctac 4200 
acctggaccg gcgccctgat caccccctgc gccgccgagg agagcaagct gcccatcaac 4260 
gccctgagca acagcctgct gcgccaccac aacatggtgt acgccaccac cagccgcagc 4320 
gccggcctgc gccagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 
cgcgacgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 
gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 
gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 
gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 
ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 
ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 
gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 
gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860 
gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 
ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 
cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 
ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 
gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 
gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 
cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 
agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 
acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 
gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 
ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520 
tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580 
cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 
gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 
cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 
ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820 
ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 
agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 
atctacctgc tgcccaaccg ctaaa 5965 



<210> 4 
<211> 37090 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MRKAd6-NSmut nucleic acid 



<400> 4 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60 
ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120 
gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180 
gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240 
taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300 
agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 
gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 
cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 480 
catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 540 
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 600 
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 660 
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 720 
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 780 
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatc 840 
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 900 
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 960 

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 1020 

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 1080 

gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 1140 

cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 1200 

tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt gagatctgcc 1260 

accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc 1320 

actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc 1380 

gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt 1440 

gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 1500 

gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 1560 

tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg 1620 

cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 1680 

tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta 1740 

tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 1800 

atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 1860 

gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 1920 

gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 1980 

ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 2040 

attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 2100 
tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 2160 



10/64 



WO 03/031588 

PCT/US02/32512 



acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 2220 
gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 2280 
gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc 2340 
atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 2400 
gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 2460 
gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 2520 
acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 2580 
agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 2640 
tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 2700 
ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 2760 
tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac 2820 
acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 2880 
ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 2940 
tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 3000 
gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 3060 
ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 3120 
atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 3180 
ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 3240 
aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 3300 
ttcgatgaaa tggaagagtg cgcctcgcac ctcccttaca tcgagcaggg aatgcagctc 3360 
gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag 3420 
gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac 3480 
atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac 3540 
cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa 3600 
agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 3660 
gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 3720 
gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 3780 
gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 3840 
gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 3900 
cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 3960 
cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 4020 
actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 4080 
aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 4140 
tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 4200 
ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 4260 
atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 4320 
aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 4380 
tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 4440 
gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 4500 
atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 4560 
gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag 4620 
gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 4680 
gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 4740 
acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 4800 
cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 4860 
gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 4920 
gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 4980 
gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 5040 
atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 5100 
gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 5160 
ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 5220 
gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 5280 
gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 5340 
tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 5400 
tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 5460 
tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 5520 

ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 5580 

ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 5640 

gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 5700 

gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 5760 

aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 5820 

ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 5880 

tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 5940 

ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 6000 

gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 6060 

aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 6120 

tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 6180 

gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 6240 

ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 6300 

acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 6360 

gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 6420 

agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 6480 

tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 6540 

tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 6600 

cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 6660 

aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg 6720 

atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 6780 

cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 6840 

ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 6900 

gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 6960 

agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 7020 

ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 7080 

gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 7140 

cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 7200 

tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 7260 

ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 7320 

ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 7380 

ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc 7440 

ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 7500 

ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 7560 

gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 7620 

atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 7680 

gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 7740 

actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 7800 

tttgcttccc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 7860 

aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 7920 

cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 7980 

gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 8040 

tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 8100 

ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 8160 

atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 8220 

gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 8280 

ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 8340 

agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 8400 

gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 8460 

tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 8520 

gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 8580 

ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 8640 

aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 8700 

ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 8760 
ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg 8820 
gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 8880 
gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 8940 
ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 9000 
ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 9060 
aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 9120 
agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct 9180 
cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 9240 
gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 9300 
tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg 9360 
tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt 9420 
catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 9480 
cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 9540 
ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 9600 
tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 9660 
tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 9720 
cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 9780 
actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 9840 
aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 9900 
cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 9960 
ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat 10020 
cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 10080 
ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 10140 
tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 10200 
gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 10260 
tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 10320 
gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 10380 
gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 10440 
gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 10500 
gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt 10560 
cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 10620 
gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 10680 
cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 10740 
ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 10800 
cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 10860 
tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 10920 
ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 10980 
gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 11040 
acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 11100 
tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 11160 
actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 11220 
cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt 11280 
actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 11340 
gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 11400 
cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 11460 
ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 11520 
gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 11580 
gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 11640 
aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaagg 11700 
tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 11760 
cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 11820 
gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 11880 
ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 11940 
agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 12000 
gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 12060 
agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca 12120 

caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 12180 

cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 12240 

ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa 12300 

catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 12360 

gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 12420 

tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaggccg catccccgcg 12480 

gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 12540 

ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag 12600 

agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt 12660 

gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 12720 

gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 12780 

gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 12840 

ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 12900 

ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 12960 

gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg 13020 

cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag 13080 

gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa 13140 

cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 13200 

ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg 13260 

gatgagctcg gcgacagtgt cgcgcacctc gcgctcaaag gctacagggg cctcttcttc 13320 

ttcttcaatc tcctcttcca taagggcctc cccttcttct tcttctggcg gcggtggggg 13380 

aggggggaca cggcggcgac gacggcgcac cgggaggcgg tcgacaaagc gctcgatcat 13440 

ctccccgcgg cgacggcgca tggtctcggt gacggcgcgg ccgttctcgc gggggcgcag 13500 

ttggaagacg ccgcccgtca tgtcccggtt atgggttggc ggggggctgc cgtgcggcag 13560 

ggatacggcg ctaacgatgc atctcaacaa ttgttgtgta ggtactccgc caccgaggga 13620 

cctgagcgag tccgcatcga ccggatcgga aaacctctcg agaaaggcgt ctaaccagtc 13680 

acagtcgcaa ggtaggctga gcaccgtggc gggcggcagc gggcggcggt cggggttgtt 13740 

tctggcggag gtgctgctga tgatgtaatt aaagtaggcg gtcttgagac ggcggatggt 13800 

cgacagaagc accatgtcct tgggtccggc ctgctgaatg cgcaggcggt cggccatgcc 13860 

ccaggcttcg ttttgacatc ggcgcaggtc tttgtagtag tcttgcatga gcctttctac 13920 

cggcacttct tcttctcctt cctcttgtcc tgcatctctt gcatctatcg ctgcggcggc 13980 

ggcggagttt ggccgtaggt ggcgccctct tcctcccatg cgtgtgaccc cgaagcccct 14040 

catcggctga agcagggcca ggtcggcgac aacgcgctcg gctaatatgg cctgctgcac 14100 

ctgcgtgagg gtagactgga agtcgtccat gtccacaaag cggtggtatg cgcccgtgtt 14160 

gatggtgtaa gtgcagttgg ccataacgga ccagttaacg gtctggtgac ccggctgcga 14220 

gagctcggtg tacctgagac gcgagtaagc ccttgagtca aagacgtagt cgttgcaagt 14280 

ccgcaccagg tactggtatc ccaccaaaaa gtgcggcggc ggctggcggt agaggggcca 14340 

gcgtagggtg gccggggctc cgggggcgag gtcttccaac ataaggcgat gatatccgta 14400 

gatgtacctg gacatccagg tgatgccggc ggcggtggtg gaggcgcgcg gaaagtcacg 14460 

gacgcggttc cagatgttgc gcagcggcaa aaagtgctcc atggtcggga cgctctggcc 14520 

ggtcaggcgc gcgcagtcgt tgacgctcta gaccgtgcaa aaggagagcc tgtaagcggg 14580 

cactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg accggggttc 14640 

gaaccccgga tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt gtcgaaccca 14700 

ggtgtgcgac gtcagacaac gggggagcgc tccttttggc ttccttccag gcgcggcgga 14760 

tgctgcgcta gcttttttgg ccactggccg cgcgcggcgt aagcggttag gctggaaagc 14820 

gaaagcatta agtggctcgc tccctgtagc cggagggtta ttttccaagg gttgagtcgc 14880 

gggacccccg gttcgagtct cgggccggcc ggactgcggc gaacgggggt ttgcctcccc 14940 

gtcatgcaag accccgcttg caaattcctc cggaaacagg gacgagcccc ttttttgctt 15000 

ttcccagatg catccggtgc tgcggcagat gcgcccccct cctcagcagc ggcaagagca 15060 

agagcagcgg cagacatgca gggcaccctc cccttctcct accgcgtcag gaggggcaac 15120 

atccgcggct gacgcggcgg cagatggtga ttacgaaccc ccgcggcgcc ggacccggca 15180 

ctacttggac ttggaggagg gcgagggcct ggcgcggcta ggagcgccct ctcctgagcg 15240 

acacccaagg gtgcagctga agcgtgacac gcgcgaggcg tacgtgccgc ggcagaacct 15300 

gtttcgcgac cgcgagggag aggagcccga ggagatgcgg gatcgaaagt tccatgcagg 15360 






gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc 15420 

cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 15480 

cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 15540 

gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 15600 

aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 15660 

gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 15720 

gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 15780 

cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 15840 

ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 15900 

ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 15960 

tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 16020 

cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 16080 

cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 16140 

ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg 16200 

cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt 16260 

gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 16320 

agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg 16380 

tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc 16440 

gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 16500 

atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac 16560 

gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac 16620 

cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 16680 

aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 16740 

cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 16800 

ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 16860 

ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 16920 

gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 16980 

ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 17040 

cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 17100 

caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 17160 

accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 17220 

agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 17280 

gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg 17340 

tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 17400 

gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 17460 

ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 17520 

gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 17580 

gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc 17640 

gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 17700 

agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 17760 

ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 17820 

ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 17880 

ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac 17940 

gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca 18000 

caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 18060 

tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 18120 

ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 18180 

ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 18240 

gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 18300 

cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 18360 

ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 18420 

ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 18480 

aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 18540 

ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 18600 

gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 18660 
18660 
gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 18720 

cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 18780 

tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 18840 

aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 18900 
gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 18960 

cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 19020 

gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 19080 

agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 19140 

tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 19200 

cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 19260 

cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 19320 

ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 19380 

atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 19440 

gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 19500 

ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 19560 

ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 19620 

gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc 19680 

tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 19740 
gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 19800 

ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 19860 

cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa 19920 

gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca 19980 

caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga 20040 

ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac 20100 

cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg 20160 

tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg 20220 

cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat 20280 

tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag 20340 

tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg ttagcggcct 20400 

gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga 20460 

ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa 20520 

aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga 20580 

agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga 20640 

tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt 20700 

acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac 20760 

gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga 20820 

ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa 20880 

ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac 20940 

actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taaagcgcga 21000 

gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga 21060 

tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat 21120 

caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag 21180 

tagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc 21240 

ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga 21300 

ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag 21360 

gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc 21420 

tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg 21480 

aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc 21540 

cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca 21600 

ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg 21660 

cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg 21720 

ccacggcctg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg 21780 

tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc 21840 

cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta 21900 

catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt 21960 
ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt 22020 

catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg 22080 

ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc 22140 

ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca 22200 

aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc 22260 

agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc 22320 

ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga 22380 

agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg 22440 

cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc 22500 

cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc 22560 

gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc 22620 

gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg 22680 

tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg 22740 

tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt 22800 

ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg 22860 

acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact 22920 

tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag 22980 

accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact 23040 

cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca 23100 

cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca 23160 

ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg 23220 

aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 23280 

agsrcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 23340 

gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 23400 

ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 23460 

aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 23520 

tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta 23580 

tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 23640 

caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 23700 

taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 23760 

ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 23820 

acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc 23880 

aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 23940 

agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 24000 

ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 24060 

caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 24120 

caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 24180 

aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 24240 

tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 24300 

accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 24360 

tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 24420 

acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 24480 

tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 24540 

ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 24600 

atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 24660 

ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 24720 

ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 24780 

ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 24840 

catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 24900 

ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 24960 

ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 25020 

ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 25080 

cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 25140 

tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca 25200 

aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 25260 
atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 25320 

tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 25380 

acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 25440 

gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 25500 

tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 25560 

atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 25620 

tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 25680 

gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 25740 

tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 25800 

ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 25860 

cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 25920 

aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 25980 

ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg 26040 

tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct 26100 

attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 26160 

ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 26220 

gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta 26280 

cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 26340 

gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 26400 

tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 26460 

gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc 26520 

acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc 26580 

gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc 26640 

ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg 26700 

ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct 26760 

ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat 26820 

gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct 26880 

gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 26940 

ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 27000 

ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 27060 

accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt 27120 

cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 27180 

acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 27240 

cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 27300 

tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 27360 

ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 27420 

ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 27480 

ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 27540 

ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 27600 

cttcattcag ccgccgcacc gtgcgcttac ctcccttgcc gtgcttgatt agcaccggtg 27660 

ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 27720 

tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 27780 

acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca 27840 

gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 27900 

ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 27960 

gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 28020 

gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 28080 

acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 28140 

ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 28200 

acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 28260 

aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg 28320 

gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 28380 

ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 28440 

ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 28500 

catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 28560 
ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 28620 

gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 28680 

tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 28740 

ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 28800 

gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 28860 

cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 28920 

gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg 28980 

cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 29040 

agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 29100 

ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 29160 

gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 29220 

cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 29280 

agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 29340 

ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 29400 

agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 29460 

ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 29520 

tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 29580 

caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 29640 

gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 29700 

cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc 29760 

gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa 29820 

gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 29880 

cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 29940 

aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 30000 

agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca 30060 

aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 30120 

gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 30180 

cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag 30240 

gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa 30300 

gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 30360 

tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt 30420 

gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga 30480 

tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag 30540 

caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 30600 

ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc 30660 

gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 30720 

ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 30780 

tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 30840 

ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta 30900 

tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 30960 

gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 31020 

ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 31080 

cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 31140 

agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 31200 

ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 31260 

catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 31320 

aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 31380 

ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 31440 

agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 31500 

tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 31560 

tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 31620 

gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 31680 

ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 31740 

gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 31800 

tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 31860 

ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 31920 

cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 31980 

cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 32040 

gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 32100 

tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 32160 

attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 32220 

ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 32280 

aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 32340 

cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 32400 

gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 32460 

caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 32520 

ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 32580 

tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 32640 

acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 32700 

gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 32760 

aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 32820 

acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 32880 

cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 32940 

ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta 33000 

acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 33060 

ggcgcaatag ggtttgatac atctggcaac atggaactta aaactggaga tggcctctat 33120 

gtggatagcg ccggtcctaa ccaaaaacta catattaatc taaataccac aaaaggcctt 33180 

gcttttgaca acaccgcaat aacaattaac gctggaaaag ggttggaatt tgaaacagac 33240 

tcctcaaacg gaaatcccat aaaaacaaaa attggatcag gcatacaata taataccaat 33300 

ggagctatgg ttgcaaaact tggaacaggc ctcagttttg acagctccgg agccataaca 33360 

atgggcagca taaacaatga cagacttact ctttggacaa caccagaccc atccccaaat 33420 

tgcagaattg cttcagataa agactgcaag ctaactctgg cgctaacaaa atgtggcagt 33480 

caaattttgg gcactgtttc agctttggca gtatcaggta atatggcctc catcaatgga 33540 

actctaagca gtgtaaactt ggttcttaga tttgatgaca acggagtgct tatgtcaaat 33600 

tcatcactgg acaaacagta ttggaacttt agaaacgggg actccactaa cggtcaacca 33660 

tacacttatg ctgttgggtt tatgccaaac ctaaaagctt acccaaaaac tcaaagtaaa 33720 

actgcaaaaa gtaatattgt tagccaggtg tatcttaatg gtgacaagtc taaaccattg 33780 

cattttacta ttacgctaaa tggaacagat gaaaccaacc aagtaagcaa atactcaata 33840 

tcattcagtt ggtcctggaa cagtggacaa tacactaatg acaaatttgc caccaattcc 33900 

tataccttct cctacattgc ccaggaataa agaatcgtga acctgttgca tgttatgttt 33960 

caacgtgttt atttttcaat tgcagaaaat ttcaagtcat ttttcattca gtagtatagc 34020 

cccaccacca catagcttat actaatcacc gtaccttaat caaactcaca gaaccctagt 34080 

attcaacctg ccacctccct cccaacacac agagtacaca gtcctttctc cccggctggc 34140 

cttaaacagc atcatatcat gggtaacaga catattctta ggtgttatat tccacacggt 34200 

ctcctgtcga gccaaacgct catcagtgat gttaataaac tccccgggca gctcgcttaa 34260 

gttcatgtcg ctgtccagct gctgagccac aggctgctgt ccaacttgcg gttgctcaac 34320 

gggcggcgaa ggagaagtcc acgcctacat gggggtagag tcataatcgt gcatcaggat 34380 

agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc cgccgccgct ccgtcctgca 34440 

ggaatacaac atggcagtgg tctcctcagc gatgattcgc accgcccgca gcataaggcg 34500 

ccttgtcctc cgggcacagc agcgcaccct gatctcactt aagtcagcac agtaactgca 34560 

gcacagtacc acaatattgt ttaaaatccc acagtgcaag gcgctgtatc caaagctcat 34620 

ggcggggacc acagaaccca cgtggccatc ataccacaag cgcaggtaga ttaagtggcg 34680 

acccctcata aacacgctgg acataaacat tacctctttt ggcatgttgt aattcaccac 34740 

ctcccggtac catataaacc tctgattaaa catggcgcca tccaccacca tcctaaacca 34800 

gctggccaaa acctgcccgc cggctatgca ctgcagggaa ccgggactgg aacaatgaca 34860 

gtggagagcc caggactcgt aaccatggat catcatgctc gtcatgatat caatgttggc 34920 

acaacacagg cacacgtgca tacacttcct caggattaca agctcctccc gcgtcagaac 34980 

catatcccag ggaacaaccc attcctgaat cagcgtaaat cccacactgc agggaagacc 35040 

tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat tcgggcagca gcggatgatc 35100 

ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt agacgatccc tactgtacgg 35160 

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: agtgcgccga 
cgtagtcata 
ggtctcgccg 
- ccaggcgccc 
catccaccac 
acacgggagg 
tccaaaacct 
aactctacag 
aggcaaacgg 
tctataaaca 
aatatatctc 
gcgccctcca 
. agacctgtat 
ttcgcagggc 
cgccaggaac 
taaccagcgt 
tgctcaaaaa 
gcagataaag 
acatgtctgc 
agaagcctgt 
gccggcgtga 
ggtcatgtcc 
cagtgctaaa 
cattacagcc 
tgaaaaaccc 
ttccacagcg 
acaccactcg 
gcgagtatat 
aaaccgcacg 
cgtcacttcc 
cacatacaag 
cacgtcacaa 
attgatgatg 



gacaaccgag 
tttcctgaag 
cttagatcgc 
cctggcttcg 
cgcagaataa 
agcgggaaga 
caaaatgaag 
ccaaagaaca 
ccctcacgtc 
ttccagcacc 
taagcaaatc 
ccttcagcct 
aagattcaaa 
cagctgaaca 
catgacaaaa 
agccccgatg 
atcaggcaaa 
gcaggtaagc 
gggtttctgc 
cttacaacag 
ccgtaaaaaa 
ggagtcataa 
aagcgaccga 
cccataggag 
tcctgcctag 
gcagccataa 
acacggcacc 
ataggactaa 
cgaacctacg 
gttttcccac 
ttactccgcc 
actccacccc 



atcgtgttgg 
caaaaccagg 
tctgtgtagt 
ggttctatgt 
gccacaccca 
gctggaagaa 
atctattaag 
gataatggca 
caagtggacg 
ttcaaccatg 
ccgaatatta 
caagcagcga 
agcggaacat 
taatcgtgca 
gaacccacac 
taagcttgtt 
gcctcgcgca 
tccggaacca 
ataaacacaa 



tcgtagtgtc 
tgcgggcgtg 
agttgtagta 
aaactccttc 



gaaaaacaac 
actggtcacc 
tgtaagactc 
aatagcccgg 
gtataacaaa 
gcaaaatagc 
cagtcagcct 
agctcaatca 
aaaatgacgt 
cccagaaacg 
gttacgtcac 
ctaaaaccta 
ctcattatca 



gccaacctac 
ccatgttttt 
tgaacgcgct 
tttgtaagat 
taaaggctaa 
cccaaataat 
agtccggcca 
atcatgattg 
taacaaaaat 
ggtctgcacg 
tgattatgac 
gcatgggcgg 
aaaaagaaag 
ccacagaaaa 
aataaaataa 
ccttataagc 
gtgattaaaa 
ggtaaacaca 
gggaatacat 
attaatagga 
accctcccgc 
taccagtaaa 
gtcacagtgt 
aacggttaaa 
aaagccaaaa 
ttcccatttt 
cgtcacccgc 
tattggcttc 



atgccaaatg 
acaaacagat 
tatccactct 
atgcgccgct 
acattcgttc 
ttttttattc 
cccctccggt 
gttgcacaat 
acccttcagg 
tctcatctcg 
ttgtaaaaat 
caaaaattca 
accgcgatcc 
gaccagcgcg 
acgcatactc 
cgatataaaa 
cacatcgtag 
agacaccatt 
caaaaaaaca 
ataagacgga 
agcaccaccg 
tcaggttgat 
acccgcaggc 
gagaaaaaca 
tccagaacaa 
aaagaaaacc 
aaaaaagggc 
gtccacaaaa 
aacccacaac 
aagaaaacta 
cccgttccca 
aatccaaaat 



gaacgccgga 
ctgcgtctcc 
ctcaaagcat 
gccctgataa 
tgcgagtcac 
caaaagatta 
ggcgtggtca 
ggcttccaaa 
gtgaatctcc 
ccaccttctc 
ctgctccaga 
ggttcctcac 
cgtaggtccc 
gccacttccc 
ggagctatgc 
tgcaaggtgc 
tcatgctcat 
tttctctcaa 
tttaaacatt 
ctacggccat 
acagctcctc 
tcacatcggt 
gtagagacaa 
cataaacacc 
catacagcgc 
tattaaaaaa 
caagtgcaga 
aacacccaga 
ttcctcaaat 
caattcccaa 
cgccccgcgc 
aaggtatatt 



<210> 5 
<211> 5955 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NS cDNA sequence 

<221> CDS 

<222> (1) . . . (5955) 

<400> 5 

Met Ilf p^^ T^"" ^^"^ ^^"^ <=9g ggc eta ctt ggt 

Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly 

1 5 10 15 

rlt T^l r^'' l^"" ^""^ Qtc gag gga 

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly 

20 25 30 

gag gtt cag gtg gtt tcc acc gca aca caa tcc ttc ctg gcg acc tgc 



35220 

35280 

35340 

35400 

35460 

35520 

35580 

35640 

35700 

35760 

35820 

35880 

35940 

36000 

36060 

36120 

36180 

36240 

36300 

36360 

36420 

36480 

36540 

36600 

36660 

36720 

36780 

36840 

36900 

36960 

37020 

37080 

37090 



48 



96 



144 
Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys 
35 40 45 

gtc aac ggc gtg tgt tgg acc gtt tac cat ggt gct ggc tca aag acc 192 
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr 
50 55 60 

tta gcc ggc cca aag ggg cca atc acc cag atg tac act aat gtg gad 240 
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp 
65 70 75 80 

cag gac ctc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tcc ttg aca 288 
Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr 
85 90 95 

cca tgc acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct 336 
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala 
100 105 110 

gac gtc att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc 384 
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu 
115 120 125 

tcc ccc agg cct gtc tcc tac ttg aag ggc tct tcg ggt ggt cca ctg 432 
Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu 
130 135 140 

ctc tgc cct tcg ggg cac gct gtg ggc atc ttc cgg gct gcc gta tgc 480 
Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys 
145 150 155 160 

acc cgg ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tcc atg 528 
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met 
165 170 175 

gaa act act atg cgg tct ccg gtc ttc acg gac aac tca tcc ccc ccg 576 
Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro 
180 185 190 

gcc gta ccg cag tca ttt caa gtg gcc cac cta cac gct ccc act ggc 624 
Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly 
195 200 205 

agc ggc aag agt act aaa gtg ccg gct gca tat gca gcc caa ggg tac 672 
Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr 
210 215 220 

aag gtg ctc gtc ctc aat ccg tcc gtt gcc gct acc tta ggg ttt ggg 720 
Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly 
225 230 235 240 

gcg tat atg tct aag gca cac ggt att gac ccc aac atc aga act ggg 768 
Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly 
245 250 255 
gta agg acc att acc aca ggc gcc ccc gtc aca tac tct acc tat ggc 816 
Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly 
260 265 270 

aag ttt ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata 864 
Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile 
275 280 285 

ata tgt gat gag tgc cat tca act gac tcg act aca atc ttg ggc atc 912 
Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile 
290 295 300 

ggc aca gtc ctg gac caa gcg gag acg gct gga gcg cgg ctt gtc gtg 960 
Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val 
305 310 315 320 

ctc gcc acc gct acg cct ccg gga tcg gtc acc gtg cca cac cca aac 1008 
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn 
325 330 335 

atc gag gag gtg gcc ctg tct aat act gga gag atc ccc ttc tat ggc 1056 
Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly 
340 345 350 

aaa gcc atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc 1104 
Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe 
355 360 365 

tgt cat tcc aag aag aag tgc gac gag ctc gcc gca aag ctg tca ggc 1152 
Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly 
370 375 380 

ctc gga atc aac gct gtg gcg tat tac cgg ggg ctc gat gtg tcc gtc 1200 
Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val 
385 390 395 400 

ata cca act atc gga gac gtc gtt gtc gtg gca aca gac gct ctg atg 1248 
Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met 
405 410 415 

acg ggc tat acg ggc gac ttt gac tca gtg atc gac tgt aac aca tgt 1296 
Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys 
420 425 430 

gtc acc cag aca gtc gac ttc agc ttg gat ccc acc ttc acc att gag 1344 
Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu 
435 440 445 

acg acg acc gtg cct caa gac gca gtg tcg cgc tcg cag cgg cgg ggt 1392 
Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly 
450 455 460 

agg act ggc agg ggt agg aga ggc atc tac agg ttt gtg act ccg gga 1440 
Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly 
465 470 475 480 
gaa cgg ccc tcg ggc atg ttc gat tcc tcg gtc ctg tgt gag tgc tat 1488 
Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr 
485 490 495 

gac gcg ggc tgt gct tgg tac gag ctc acg ccc gcc gag acc tcg gtt 1536 
Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val 
500 505 510 

agg ttg cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac 1584 
Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp 
515 520 525 

cac ctg gag ttc tgg gag agt gtc ttc aca ggc ctc acc cac ata gat 1632 
His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp 
530 535 540 

gca cac ttc ttg tcc cag acc aag cag gca gga gac aac ttc ccc tac 1680 
Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr 
545 550 555 560 

ctg gta gca tac caa gcc acg gtg tgc gcc agg gct cag gcc cca cct 1728 
Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro 
565 570 575 

cca tca tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg 1776 
Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr 
580 585 590 

ctg cac ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat 1824 
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn 
595 600 605 

gag gtc acc ctc acc cac ccc ata acc aaa tac atc atg gca tgc atg 1872 
Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met 
610 615 620 

tcg gct gac ctg gag gtc gtc act agc acc tgg gtg ctg gtg ggc gga 1920 
Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly 
625 630 635 640 

gtc ctt gca gct ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc 1968 
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val 
645 650 655 

att gtg ggt agg att atc ttg tcc ggg agg ccg gct att gtt ccc gac 2016 
Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp 
660 665 670 

agg gag ttt ctc tac cag gag ttc gat gaa atg gaa gag tgc gcc tcg 2064 
Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser 
675 680 685 

cac ctc cct tac atc gag cag gga atg cag ctc gcc gag caa ttc aag 2112 
His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys 
690 695 700 

cag aaa gcg ctc ggg tta ctg caa aca gcc acc aaa caa gcg gag gct 2160 
Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala 
705 710 715 720 

t^^ "° ^'='= •^sar cga gcc ctt gag aca ttc tgg 2208 
Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp 
725 730 735 

gcg aag cac atg tgg aat ttc atc agc ggg ata cag tac tta gca ggc 2256 
Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly 
740 745 750 

tta tcc act ctg cct ggg aac ccc gca ata gca tca ttg atg gca ttc 2304 
Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe 
755 760 765 

^^'^ "^^^ '^aa acc ctc ctg ttt 2352 
Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe 
770 775 780 

aac atc ttg ggg ggg tgg gtg gct gcc caa ctc gcc ccc ccc agc gcc 2400 
Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala 
785 790 795 800 

gct tcg gct ttc gtg ggc gcc ggc atc gcc ggt gcg gct gtt ggc agc 2448 
Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser 
805 810 815 

ata ggc ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca 2496 
Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala 
820 825 830 

gga gtg gcc ggc gcg ctc gtg gcc ttc aag gtc atg agc ggc gag atg 2544 
Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met 
835 840 845 

ccc tcc acc gag gac ctg gtc aat cta ctt cct gcc atc ctc tct cct 2592 
Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro 
850 855 860 

ggc gcc ctg gtc gtc ggg gtc gtg tgt gca gca ata ctg cgt cga cac 2640 
Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His 
865 870 875 880 

w!? T"^ S?^ ^3*= ctg ata gcg 2688 
Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala 
885 890 895 

lu"" ""^^ tat gtg cct gag 2736 
Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu 
900 905 910 
agc gac gcc gca gcg cgt gtt act cag atc ctc tcc agc ctt acc atc 2784 
Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile 
915 920 925 

act cag ctg ctg aaa agg ctc cac cag tgg att aat gaa gac tgc tcc 2832 
Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser 
930 935 940 

aca ccg tgt tcc ggc tcg tgg cta agg gat gtt tgg gac tgg ata tgc 2880 
Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys 
945 950 955 960 

acg gtg ttg act gac ttc aag acc tgg ctc cag tcc aag ctc ctg ccg 2928 
Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro 
965 970 975 

cag cta ccg gga gtc cct ttt ttc tcg tgc caa cgc ggg tac aag gga 2976 
Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly 
980 985 990 

gtc tgg cgg gga gac ggc atc atg caa acc acc tgc cca tgt gga gca 3024 
Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala 
995 1000 1005 

cag atc acc gga cat gtc aaa aac ggt tcc atg agg atc gtc ggg cct 3072 
Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro 
1010 1015 1020 

aag acc tgc agc aac acg tgg cat gga aca ttc ccc atc aac gca tac 3120 
Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr 
1025 1030 1035 1040 

acc acg ggc ccc tgc aca ccc tct cca gcg cca aac tat tct agg gcg 3168 
Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala 
1045 1050 1055 

ctg tgg cgg gtg gcc gct gag gag tac gtg gag gtc acg cgg gtg ggg 3216 
Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly 
1060 1065 1070 

gat ttc cac tac gtg acg ggc atg acc act gac aac gta aag tgc cca 3264 
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro 
1075 1080 1085 

tgc cag gtt ccg gct cct gaa ttc ttc acg gag gtg gac gga gtg cgg 3312 
Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg 
1090 1095 1100 

ttg cac agg tac gct ccg gcg tgc agg cct ctc cta cgg gag gag gtt 3360 
Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val 
1105 1110 1115 1120 

aca ttc cag gtc ggg ctc aac caa tac ctg gtt ggg tca cag cta cca 3408 
Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro 
1125 1130 1135 
nf^ Z""^ tcc atg ctc acc gac 3456 
Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp 
1140 1145 1150 

ccc tcc cac atc aca gca gaa acg gct aag cgt agg ttg gcc agg ggg 3504 
Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly 
1155 1160 1165 

tct ccc ccc tcc ttg gcc agc tct tca gct agc cag ttg tct gcg cct 3552 
Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro 
1170 1175 1180 

tcc ttg aag gcg aca tgc act acc cac cat gtc tct ccg gac gct aac 3600 
Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp 
1185 1190 1195 1200 

L^U ill lit tV" r'"'' f*"^ ""^^ ""^^ ^"^^ atc 3648 
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile 
1205 1210 1215 

^tZ t^"" Sf^ '"'^ gtc ctg gac tct ttc gac 3696 
Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp 
1220 1225 1230 

llf. r^^ cga gcg gag gag gat gag agg gaa gta tcc gtt ccg gcg gag 3744 
Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu 
1235 1240 1245 

atc ctg cgg aaa tcc aag aag ttc ccc gca gcg atg ccc atc tgg gcg 3792 
Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala 
1250 1255 1260 

cgc ccg gat tac aac cct cca ctg tta gag tcc tgg aag gac ccg gac 3840 
Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp 
1265 1270 1275 1280 

tac gtc cct ccg gtg gtg cac ggg tgc ccg ttg cca cct atc aag gcc 3888 
Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala 
1285 1290 1295 

P^o Tif l"^ ^""^ ""^^ S'^*^ =ta aca gag 3936 
Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu 
1300 1305 1310 

tcc tcc gtg tct tct gcc tta gcg gag ctc gct act aag acc ttc ggc 3984 
Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly 
1315 1320 1325 

agc tcc gaa tca tcg gcc gtc gac agc ggc acg gcg acc gcc ctt cct 4032 
Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro 
1330 1335 1340 

gac cag gcc tcc gac gac ggt gac aaa gga tcc gac gtt gag tcg tac 4080 
Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr 
1345 1350 1355 1360 

tcc tcc atg ccc ccc ctt gag ggg gaa ccg ggg gac ccc gat ctc agt 4128 
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 
1365 1370 1375 

gac ggg tct tgg tct acc gtg agc gag gaa gct agt gag gat gtc gtc 4176 
Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val 
1380 1385 1390 

tgc tgc tca atg tcc tac aca tgg aca ggc gcc ttg atc acg cca tgc 4224 
Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys 
1395 1400 1405 

gct gcg gag gaa agc aag ctg ccc atc aac gcg ttg agc aac tct ttg 4272 
Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu 
1410 1415 1420 

ctg cgc cac cat aac atg gtt tat gcc aca aca tct cgc agc gca ggc 4320 
Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly 
1425 1430 1435 1440 

ctg cgg cag aag aag gtc acc ttt gac aga ctg caa gtc ctg gac gac 4368 
Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp 
1445 1450 1455 

cac tac cgg gac gtg ctc aag gag atg aag gcg aag gcg tcc aca gtt 4416 
His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val 
1460 1465 1470 

aag gct aaa ctc cta tcc gta gag gaa gcc tgc aag ctg acg ccc cca 4464 
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro 
1475 1480 1485 

cat tcg gcc aaa tcc aag ttt ggc tat ggg gca aag gac gtc cgg aac 4512 
His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn 
1490 1495 1500 

cta tcc agc aag gcc gtt aac cac atc cac tcc gtg tgg aag gac ttg 4560 
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu 
1505 1510 1515 1520 

ctg gaa gac act gtg aca cca att gac acc acc atc atg gca aaa aat 4608 
Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn 
1525 1530 1535 

gag gtt ttc tgt gtc caa cca gag aaa gga ggc cgt aag cca gcc cgc 4656 
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg 
1540 1545 1550 

ctt atc gta ttc cca gat ctg gga gtc cgt gta tgc gag aag atg gcc 4704 
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala 
1555 1560 1565 
ctc tat gat gtg gtc tcc acc ctt act cag gtc gtg atg ggc tcc tca 4752 
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser 
1570 1575 1580 

Sf^ """^^ gtc gag ttc ctg gtg aat 4800 
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn 
1585 1590 1595 1600 

acc tgg aaa tca aag aaa aac ccc atg ggc ttt tca tat gac act cgc 4848 
Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg 
1605 1610 1615 

tgt ttc gac tca acg gtc acc gag aac gac atc cgt gtt gag gag tca 4896 
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser 
1620 1625 1630 

att tac caa tgt tgt gac ttg gcc ccc gaa gcc aga cag gcc ata aaa 4944 
Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys 
1635 1640 1645 

tcg ctc aca gag cgg ctt tat atc ggg ggt cct ctg act aat tca aaa 4992 
Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys 
1650 1655 1660 

ggg cag aac tgc ggt tat cgc cgg tgc cgc gcg agc ggc gtg ctg acg 5040 
Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr 
1665 1670 1675 1680 

h^l 1^*^ f.^^ ^^"^ ^'^^ ^^Sr gcc tct gca gcc 5088 
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala 
1685 1690 1695 

tgt cga gct gcg aag ctc cag gac tgc acg atg ctc gtg aac gga gac 5136 
Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp 
1700 1705 1710 

gac ctt gtc gtt atc tgt gaa agc gcg gga acc caa gag gac gcg gcg 5184 
Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala 
1715 1720 1725 

agc cta cga gtc ttc acg gag gct atg act agg tac tct gcc ccc ccc 5232 
Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro 
1730 1735 1740 

ggg gac ccg ccc caa cca gaa tac gac ttg gag ctg ata aca tca tgt 5280 
Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys 
1745 1750 1755 1760 

tcc tcc aat gtg tcg gtc gcc cac gat gca tca ggc aaa agg gtg tac 5328 
Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr 
1765 1770 1775 

tac ctc acc cgt gat ccc acc acc ccc ctc gca cgg gct gcg tgg gaa 5376 
Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu 
1780 1785 1790 
aca gct aga cac act cca gtt aac tcc tgg cta ggc aac att atc atg 5424 
Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met 
1795 1800 1805 

tat gcg ccc act ttg tgg gca agg atg att ctg atg act cac ttc ttc 5472 
Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe 
1810 1815 1820 

tcc atc ctt cta gca cag gag caa ctt gaa aaa gcc ctg gac tgc cag 5520 
Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln 
1825 1830 1835 1840 

atc tac ggg gcc tgt tac tcc att gag cca ctt gac cta cct cag atc 5568 
Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile 
1845 1850 1855 

att gaa cga ctc cat ggc ctt agc gca ttt tca ctc cat agt tac tct 5616 
Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser 
1860 1865 1870 

cca ggt gag atc aat agg gtg gct tca tgc ctc agg aaa ctt ggrg gta 5664 
Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val 
1875 1880 1885 

cca ccc ttg cga gtc tgg aga cat cgg gcc aofg agc gtc cgc gct agg 5712 
Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg 
1890 1895 1900 

cta ctg tcc cag ggg ggg agg gcc gcc act tgt ggc aag tac ctc ttc 5760 
Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe 
1905 1910 1915 1920 

aac tgg gca gtg aag acc aaa ctc aaa ctc act cca atc ccg gct gcg 5808 
Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala 
1925 1930 1935 

tcc cag ctg gac ttg tcc ggc tgg ttc gtt gct ggt tac agc ggg gga 5856 
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly 
1940 1945 1950 

gac ata tat cac agc ctg tct cgt gcc cga ccc cgc tgg ttc atg ctg 5904 
Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu 
1955 1960 1965 

tgc cta ctc cta ctt tct gta ggg gta ggc atc tac ctg ctc ccc aac 5952 
Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn 
1970 1975 1980 

cga 5955 
Arg 
1985 



<210> 6 
<211> 1984 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> NS sequence 
<400> 6 

Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys 
1 5 10 15 

Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu 
20 25 30 

Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val 
35 40 45 

Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu 
50 55 60 

Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln 
65 70 75 80 

Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro 
85 90 95 

Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp 
100 105 110 

Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser 
115 120 125 

Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu 
130 135 140 

Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr 
145 150 155 160 

Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu 
165 170 175 

Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala 
180 185 190 

Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser 
195 200 205 

Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys 
210 215 220 

Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala 
225 230 235 240 

Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val 
245 250 255 

Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys 
260 265 270 

Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile 
275 280 285 

Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly 
290 295 300 

Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu 
305 310 315 320 

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile 
325 330 335 

Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys 
340 345 350 

Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys 
355 360 365 

His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu 
370 375 380 

Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile 
385 390 395 400 

Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr 
405 410 415 

Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val 
420 425 430 

Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr 
435 440 445 

Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg 
450 455 460 

Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu 
465 470 475 480 

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp 
485 490 495 

Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg 
500 505 510 

Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His 
515 520 525 

Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala 
530 535 540 

His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu 
545 550 555 560 

Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro 
565 570 575 

Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu 
580 585 590 

His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu 
595 600 605 

Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser 
610 615 620 

Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val 
625 630 635 640 

Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile 
645 650 655 

Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg 
660 665 670 

Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His 
675 680 685 

Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln 
690 695 700 

Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala 
705 710 715 720 

Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala 
725 730 735 

Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu 
740 745 750 

Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr 
755 760 765 

Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn 
770 775 780 

Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala 
785 790 795 800 

Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile 
805 810 815 

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly 
820 825 830 

Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro 
835 840 845 

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly 
850 855 860 

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val 
865 870 875 880 

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe 
885 890 895 

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser 
900 905 910 

Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile Thr 
915 920 925 

Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser Thr 
930 935 940 

Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys Thr 
945 950 955 960 

Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro Gln 
965 970 975 

Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly Val 
980 985 990 

Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala Gln 
995 1000 1005 

Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro Lys 
1010 1015 1020 

Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr Thr 
1025 1030 1035 1040 

Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala Leu 
1045 1050 1055 

Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly Asp 
1060 1065 1070 

Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro Cys 
1075 1080 1085 

Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg Leu 
1090 1095 1100 

His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val Thr 
1105 1110 1115 1120 

Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys 
1125 1130 1135 

Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro 
1140 1145 1150 

Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser 
1155 1160 1165 

Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser 
1170 1175 1180 

Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp Leu 
1185 1190 1195 1200 

Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr 
1205 1210 1215 

Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp Pro 
1220 1225 1230 

Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile 
1235 1240 1245 

Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala Arg 
1250 1255 1260 
Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp Tyr 
1265 1270 1275 1280 

Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala Pro 
1285 1290 1295 

Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser 
1300 1305 1310 

Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly Ser 
1315 1320 1325 

Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro Asp 
1330 1335 1340 

Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr Ser 
1345 1350 1355 1360 

Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp 
1365 1370 1375 

Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val Cys 
1380 1385 1390 

Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala 
1395 1400 1405 

Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu 
1410 1415 1420 

Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly Leu 
1425 1430 1435 1440 

Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp His 
1445 1450 1455 

Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys 
1460 1465 1470 

Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His 
1475 1480 1485 

Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu 
1490 1495 1500 

Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu Leu 
1505 1510 1515 1520 

Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu 
1525 1530 1535 

Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu 
1540 1545 1550 

Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu 
1555 1560 1565 

Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser Tyr 
1570 1575 1580 

Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Thr 
1585 1590 1595 1600 

Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys 
1605 1610 1615 

Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile 
1620 1625 1630 

Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser 
1635 1640 1645 

Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly 
1650 1655 1660 

Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr 
1665 1670 1675 1680 

Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys 
1685 1690 1695 
Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp Asp 
1700 1705 1710 

Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser 
1715 1720 1725 

Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly 
1730 1735 1740 

Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser 
1745 1750 1755 1760 

Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr Tyr 
1765 1770 1775 

Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr 
1780 1785 1790 

Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Tyr 
1795 1800 1805 

Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser 
1810 1815 1820 

Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln Ile 
1825 1830 1835 1840 

Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile Ile 
1845 1850 1855 

Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro 
1860 1865 1870 

Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val Pro 
1875 1880 1885 

Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu 
1890 1895 1900 

Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe Asn 
1905 1910 1915 1920 

Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala Ser 
1925 1930 1935 

Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly Asp 
1940 1945 1950 

Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu Cys 
1955 1960 1965 

Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg 
1970 1975 1980 

<210> 7 
<211> 4909 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pVlJ nucleic acid 

<400> 7 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 
llf-tt^t^^ tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 
l^^il^^r.^ ttgcatacgt tgtatccata tcataatatg tacatttata ttgg?tca?S 300 
^^^^f''^"^ ccgccatgtt gacattgatt attgactagt tattaatagt aalcaatta? 360 
clltlltll^ gttcatagcc catatatgga gttccgcgtt acataactla cggtaaatgg 420 
llttltitJ.^ tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcl 480 
catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540
tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600

tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660

ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720 

catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780 

cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840 

ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900 

agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960

tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020 

tccccgtgcc aagagtgacg taagtaccgc ctatagactc tataggcaca cccctttggc 1080 

tcttatgcat gctatactgt ttttggcttg gggcctatac acccccgctt ccttatgcta 1140 

taggtgatgg tatagcttag cctataggtg tgggttattg accattattg accactcccc 1200

tattggtgac gatactttcc attactaatc cataacatgg ctctttgcca caactatctc 1260 

tattggctat atgccaatac tctgtccttc agagactgac acggactctg tatttttaca 1320

ggatggggtc ccatttatta tttacaaatt cacatataca acaacgccgt cccccgtgcc 1380 

cgcagttttt attaaacata gcgtgggatc tccacgcgaa tctcgggtac gtgttccgga 1440 

catgggctct tctccggtag cggcggagct tccacatccg agccctggtc ccatgcctcc 1500 
agcggctcat ggtcgctcgg cagctccttg ctcctaacag tggaggccag acttaggcac 1560

agcacaatgc ccaccaccac cagtgtgccg cacaaggccg tggcggtagg gtatgtgtct 1620 

gaaaatgagc gtggagattg ggctcgcacg gctgacgcag atggaagact taaggcagcg 1680 

gcagaagaag atgcaggcag ctgagttgtt gtattctgat aagagtcaga ggtaactccc 1740

gttgcggtgc tgttaacggt ggagggcagt gtagtctgag cagtactcgt tgctgccgcg 1800 

cgcgccacca gacataatag ctgacagact aacagactgt tcctttccat gggtcttttc 1860 

tgcagtcacc gtccttagat ctaggtacca gatatcagaa ttcagtcgac agcggccgcg 1920 

atctgctgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt 1980

gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca 2040 

ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca gcaaggggga 2100

ggattgggaa gacaatagca ggcatgctgg ggatgcggtg ggctctatgg ccgctgcggc 2160 

caggtgctga agaattgacc cggttcctcc tgggccagaa agaagcaggc acatcccctt 2220

ctctgtgaca caccctgtcc acgcccctgg ttcttagttc cagccccact cataggacac 2280 

tcatagctca ggagggctcc gccttcaatc ccacccgcta aagtacttgg agcggtctct 2340 

ccctccctca tcagcccacc aaaccaaacc tagcctccaa gagtgggaag aaattaaagc 2400 

aagataggct attaagtgca gagggagaga aaatgcctcc aacatgtgag gaagtaatga 2460 

gagaaatcat agaatttctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 2520 

gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 2580 

ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 2640

ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 2700 

acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 2760

tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 2820 

ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 2880 

ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 2940 

ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 3000 

actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 3060 

gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 3120

tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 3180 

caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 3240 

atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 3300 

acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 3360 

ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 3420 

ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 3480 

tgcctgactc gggggggggg ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc 3540 

ataccaggcc tgaatcgccc catcatccag ccagaaagtg agggagccac ggttgatgag 3600 

agctttgttg taggtggacc agttggtgat tttgaacttt tgctttgcca cggaacggtc 3660 

tgcgttgtcg ggaagatgcg tgatctgatc cttcaactca gcaaaagttc gatttattca 3720 

acaaagccgc cgtcccgtca agtcagcgta atgctctgcc agtgttacaa ccaattaacc 3780
aattctgatt agaaaaactc atcgagcatc aaatgaaact gcaatttatt catatcagga 3840
ttatcaatac catatttttg aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg 3900

cagttccata ggatggcaag atcctggtat cggtctgcga ttccgactcg tccaacatca 3960 

atacaaccta ttaatttccc ctcgtcaaaa ataaggttat caagtgagaa atcaccatga 4020 

gtgacgactg aatccggtga gaatggcaaa agcttatgca tttctttcca gacttgttca 4080 

acaggccagc cattacgctc gtcatcaaaa tcactcgcat caaccaaacc gttattcatt 4140

cgtgattgcg cctgagcgag acgaaatacg cgatcgctgt taaaaggaca attacaaaca 4200 

ggaatcgaat gcaaccggcg caggaacact gccagcgcat caacaatatt ttcacctgaa 4260 

tcaggatatt cttctaatac ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac 4320 

catgcatcat caggagtacg gataaaatgc ttgatggtcg gaagaggcat aaattccgtc 4380 

agccagttta gtctgaccat ctcatctgta acatcattgg caacgctacc tttgccatgt 4440 

ttcagaaaca actctggcgc atcgggcttc ccatacaatc gatagattgt cgcacctgat 4500 

tgcccgacat tatcgcgagc ccatttatac ccatataaat cagcatccat gttggaattt 4560 

aatcgcggcc tcgagcaaga cgtttcccgt tgaatatggc tcataacacc ccttgtatta 4620 

ctgtttatgt aagcagacag ttttattgtt catgatgata tatttttatc ttgtgcaatg 4680 

taacatcaga gattttgaga cacaacgtgg ctttcccccc ccccccatta ttgaagcatt 4740 

tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa 4800 

acaggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga aaccattatt 4860 

atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtc 4909 

<210> 8 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 6 
<400> 8 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60 

llflft^^^ gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120 

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

f^?""^^ gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240 

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300 

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480 

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540 

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600 

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660 

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720 

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780 

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840 

cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900 

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960 

cgaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020 

caggtcttgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080 

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140 

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200 

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260 

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320 

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380 

ccctctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440 

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500 

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560 

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620 

gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680 

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740 

ccctctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800 
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tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860 

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460 

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760 

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880 

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940 

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000 

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060 

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360 

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600 

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660 

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720 

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840 

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900 

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960 

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140 

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200 

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260 

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320 

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380 

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440 

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500 

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560 

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620 

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680 

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740 

gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800 

btttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860 

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980 

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040 

ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100

caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160

gcagttccag Sfcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340

aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400

gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580 

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640 

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700 

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820 

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880 

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940 

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000 

tcgttccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060 

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240 

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300 

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360 

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480 

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540 

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600 

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660 

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720 

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780 

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840 

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900 

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960 

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020 

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080 

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140 

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200 

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260 

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320 

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380 

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440 

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500 

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560 

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620 

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680 

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740 

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800 

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920 

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980 

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100 

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160 

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220 

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280 

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340 

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400 

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520 

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580 

gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640 

gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc Srgcggcgcga gctcagcgac 12060

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg Srcggcgctgc 12360

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500

Srtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820

cactggtctt gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940

caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120 

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180 

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240 

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300 

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480 

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540 

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660 

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720 

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780 

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840 

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900 

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960 

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020 

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080 

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140 

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200 

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260 

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320 

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380 

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440 

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500 

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560 

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620 

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680 

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740 

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800 

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860 

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920 

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980 

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040 

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100 

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220 

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280 

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340 

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460 

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520 

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580 

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640 

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700 

ttccccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760 

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820 

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880 

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940 

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000 

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060 

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120 

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180 

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240 

tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300 
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aaaataagat 
tggagacagt 
ctctggtgac 
ccaccacccg 
cgctggacct 
. ccgttgttgt 
cgttgcggcc 
gggtgcaatc 
atgtatgcgt 
gatggctacc 
ctcggagtac 
cctgaataac 
;gtcccagcgt 
caaggcgcgg 
ctttgacatc 
ctacaacgcc 
tgctcttgaa 
agctgagcag 
aaaggagggt 
tcaacctgaa 
tgggagagtc 
cacaaatgaa 
tcaagtggaa 
gactcctaaa 
ttcttacatg 
gcccaacagg 
cagcacgggt 
tttgcaagac 
aaccaggtac 
tattgaaaat 
gattaataca 
aaaagatgct 
ggaaatcaat 
tttgcccgac 
ctacgactac 
tggagcacgc 
tgctggcctg 
ccaggtgcct 
ctacgagtgg 
cctaagggtt 
ccccatggcc 
ccagtccttt 
taccaacgtg 
cacgcgcctt 
ctactctggc 
ggtggccatt 
caacgagttt 
catgaccaaa 
cttctatatc 
catgagccgt 
acaccaacac 
ggcctaccct 
ccagaaaaag 
gtccatgggc 
gctagacatg 



taacagtaag 
gtctccagag 
gcaaatagac 
tcccatcgcg 
gcctcccccc 
aacccgtcct 
cgtagccagt 
cctgaagcgc 
ccatgtcgcc 
ccttcgatga 
ctgagccccg 
aagtttagaa 
ttgacgctgc 
ttcaccctag 
cgcggcgtgc 
ctggctccca 
ataaacctag 
caaaaaactc 
attcaaatag 
cctcaaatag 
cttaaaaaga 
aatggagggc 
atgcaatttt 
gtggtattgt 
cccactatta 
cctaattaca 
aatatgggtg 
agaaacacag 
ttttctatgt 
catggaactg 
gagactctta 
acagaatttt 
ctaaatgcca 
aagctaaagt 
atgaacaagc 
tggtcccttg 
cgctaccgct 
cagaagttct 
aacttcagga 
gacggagcca 
cacaacaccg 
aacgactatc 
cccatatcca 
aagactaagg 
tctataccct 
acctttgact 
gaaattaagc 
gactggttcc 
ccagagagct 
caggtggtgg 
aacaactctg 
gctaacttcc 
tttctttgcg 
gcactcacag 
acttttgagg 



cttgatcccc 
gggcgtggcg 
gagcctccct 
cccatggcta 
gccgacaccc 
agccgcgcgt 
ggcaactggc 
cgacgatgct 
gccagaggag 
tgccgcagtg 
ggctggtgca 
accccacggt 
ggttcatccc 
ctgtgggtga 
tggacagggg 
agggtgcccc 
aagaagagga 
acgtatttgg 
gtgtcgaagg 
gagaatctca 
ctaccccaat 
aaggcattct 
tctcaactac 
acagtgaaga 
aggaaggtaa 
ttgcttttag 
ttctggcggg 
agctttcata 
ggaatcaggc 
aagatgaact 
ccaaggtaaa 
cagataaaaa 
acctgtggag 
acagtccttc 
gagtggtggc 
actatatgga 
caatgttgct 
ttgccattaa 
aggatgttaa 
gcattaagtt 
cctccacgct 
tctccgccgc 
tcccctcccg 
aaaccccatc 
acctagatgg 
cttctgtcag 
gctcagttga 
tggtacaaat 
acaaggaccg 
atgatactaa 
gatttgttgg 
cctatccgct 
atcgcacccfc 
acctgggcca 
tggatcccat 



gccctcccgt 
aaaagcgtcc 
cgtacgagga 
ccggagtgct 
agcagaaacc 
ccctgcgccg 
aaagcacact 
tctgaatagc 
ctgctgagcc 
gtcttacatg 
gtttgcccgc 
ggcgcctacg 
tgtggaccgt 
taaccgtgtg 
ccctactttt 
aaatccttgc 
cgatgacaac 
gcaggcgcct 
tcaaacacct 
gtggrtacgaa 
gaaaccatgt 
tgtaaagcaa 
tgaggcgacc 
tgtagatata 
ctcacgagaa 
ggacaatttt 
ccaagcatcg 
ccagcttttg 
tgttgacagc 
tccaaattac 
acctaaaaca 
tgaaataaga 
aaatttcctg 
caacgtaaaa 
tcccgggtta 
caacgtcaac 
gggcaatggt 
aaacctcctt 
catggttctg 
tgatagcatt 
tgaggccatg 
caacatgctc 
caactgggcg 
actgggctcg 
aaccttttac 
ctggcctggc 
cggggagggt 
gctagctaac 
catgtactcc 
atacaaggac 
ctaccttgcc 
tataggcaag 
ttggcgcatc 
aaaccttctc 
ggacgagccc 



dgaggagcct 
gcgccccgac 
ggcactaaag 
gggccagcac 
tgtgctgcca 
cgccgccagc 
gaacagcatc 
taacgtgtcg 
gccgcgcgcc 
cacatctcgg 
gccaccgaga 
cacgacgtga 
gaggatactg 
ctggacatgg 
aagccctact 
gaatgggatg 
gaagacgaag 
tattctggta 
aaatatgccg 
actgaaatta 
tacggttcat 
caaaatggaa 
gcaggcaatg 
gaaaccccag 
ctaatgggcc 
attggtctaa 
cagttgaatg 
cttgattcca 
tatgatccag 
tgctttccac 
ggtcaggaaa 
gttggaaata 
tactccaaca 
atttctgata 
gtggactgct 
ccatttaacc 
cgctatgtgc 
ctcctgccgg 
cagagctccc 
tgcctttacg 
cttagaaacg 
taccctatac 
gctttccgcg 
ggctacgacc 
ctcaaccaca 
aatgaccgcc 
tacaacgttg 
tacaacattg 
ttctttagaa 
taccaacagg 
cccaccatgc 
accgcagttg 
ccattctcca 
tacgccaact 
acccttcttt 



ccaccggccg 
agggaagaaa 
caaggcctgc 
acacccgtaa 
ggcccgaccg 
ggtccgcgat 
gtgggtctgg 
tatgtgtgtc 
cgctttccaa 
gccaggacgc 
cgtacttcag 
ccacagaccg 
cgtactcgta 
cttccacgta 
ctggcactgc 
aagctgctac 
tagacgagca 
taaatattac 
ataaaacatt 
atcatgcagc 
atgcaaaacc 
agctagaaag 
gtgataactt 
acactcatat 
aacaatctat 
tgtattacaa 
ctgttgtaga 
ttggtgatag 
atgttagaat 
tgggaggtgt 
atggatggga 
attttgccat 
tagcgctgta 
acccaaacac 
acattaacct 
accaccgcaa 
ccttccacat 
gctcatacac 
taggaaatga 
ccaccttctt 
acaccaacga 
ccgccaacgc 
gctgggcctt 
cttattacac 
cctttaagaa 
tgcttacccc 
cccagtgtaa 
gctaccaggg 
acttccagcc 
tgggcatcct 
gcgaaggaca 
acagcattac 
gtaactttat 
ccgcccacgc 
atgttttgtt 



18360 

18420 

18480 

18540 

18600 

18660 

18720 

18780 

18840 

18900 

18960 

19020 

19080 

19140 

19200 

19260 

19320 

19380 

19440 

19500 

19560 

19620 

19680 

19740 

19800 

19860 

19920 

19980 

20040 

20100 

20160 

20220 

20280 

20340 

20400 

20460 

20520 

20580 

20640 

20700 

20760 

20820 

20880 

20940 

21000 

21060 

21120 

21180 

21240 

21300 

21360 

21420 

21480 

21540 

21600 
tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780 

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840 

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900 

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960 

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020 

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080 

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260 

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320 

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500 

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560 

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620 

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740 

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800 

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860 

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920 

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040 

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100 

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160 

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220 

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280 

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460 

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700 

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820 

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940 

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000 

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120 

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300 

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360 

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480 

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540 

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600 

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660 

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720 

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900
atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960
caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020
cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080
gtggagcttg agtgcatgca gcggttcttc gctgacccgg agatgcagcg caagctagag 25140
gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200
gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260
aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320
tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380
gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440
gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500
cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560
aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620
gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680
ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740
ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800
aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860
cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920
taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980
caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040
ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100
gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160
tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220
gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280
cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340
cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400
gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460
gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520
gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580
gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640
ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700
tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760
cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820
cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880
agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940
aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000
acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060
actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120
tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180
aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240
ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300
gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360
ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420
gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480
actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540
taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600
cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660
cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720
ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780
gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840
cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900
tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960
tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020
ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080
gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140
gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200
cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320 

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380 

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500 

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560 

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620 

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680 

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740 

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800 

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860 

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920 

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040 

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100 

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160 

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220 

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280 

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340 

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460 

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520 

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640 

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700 

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820 

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880 

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000 

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060 

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120 

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180 

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240 

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300 

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360 

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420 

tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480 

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540 

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600 

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660 

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720 

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780 

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840 

aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900 

ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960 

aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020 

actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080 

gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140 

gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200 

cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260 

gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320 

aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380 

gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440 

ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500 
gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560
actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620
gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680
acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740
tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800
aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860
tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920
aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980
aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040
acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100
acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160
gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220
aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280
aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340
gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400
agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460
gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520
gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580
agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640
attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700
ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760
actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820
ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880
tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940
acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000
catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060
caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120
gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180
agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240
ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300
ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360
ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420
aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480
agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540
cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600
tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660
ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720
ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780
cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840
aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900
cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960
agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020
caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080
tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140
tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200
tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260
cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320
cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380
aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440
aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500
ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560
ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620
agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680
ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740
gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800
gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860
tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920
ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980
tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040
caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100
ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160
ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220
cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280
gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340
cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400
ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460
tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520
cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580
gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640
actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700
ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760
cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820
ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880
accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935



<210> 9 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 5 



<400> 9 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60
ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120
gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180
gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240
taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300
agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360
gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420
cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480
tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540
tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600
aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660
tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720
cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780
gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840
cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900
ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960
cgaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020
caggtcttgt cattaccacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080
ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140
tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200
gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260
ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320
cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380
ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440
gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500
cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560
ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620
gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680
cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740
ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800
tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860
gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920
caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980
gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040
agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100
aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160
cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220
gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280
gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340
gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400
gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460
tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520
ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580
tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640
agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700
tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760
gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820
acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880
gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940
aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000
ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060
gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120
tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180
acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240
aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300
tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360
gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420
tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480
ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540
tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600
ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660
gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720
ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780
agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840
actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900
acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960
ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020
atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080
cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140
cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200
acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260
gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320
ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380
taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440
tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500
tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560
tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620
gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680
ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740
gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800
ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860
gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920
cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980
tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040
ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100
caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760 

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240 

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300 

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480 

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660 

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020 

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200 

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260 

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620 

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680 

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800 

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920 

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980 

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 
aacatcgcgc
gagctcctgc
cctaatttcc
cggcgcgact 
atctaaaagc 
agagggggca 
ttgctggcga 
acgacgggcc 
ttgacggcgg 
tcggccatga 
gtggcggcga 
tcgttccaga 
tgcgcgagat 
aggtagttga 
aacgtggatt 
acggcgaagt 
cggatgagct 
tcttcttcaa 



ggagggggga
atctccccgc 
agttggaaga 
agggatacgg 
gacctgagcg 
tcacagtcgc 
tttctggcgg 
gtcgacagaa 
ccccaggctt 
accggcactt 
gcggcggagt 
ctcatcggct 
acctgcgtga 
ttgatggtgt 
gagagctcgg 
gtccgcacca 
cagcgtaggg 
tagatgtacc 
cggacgcggt 
ccggtcaggc 
ggcactcttc 
tcgagccccg 
caggtgtgcg 
gctgctgcgc 
gcgaaagcat 
gcgggacccc 
ccgtcatgca 
ttttcccaga 
caagagcagc 
acatccgcgg 
cactacctgg 
cggtacccaa 
ctgtttcgcg 
gggcgcgagc 
cccgacgcgc 
accgcatacg 
gtgcgtacgc 



agatgggagc 
aggtttacct 
aggggctggt 
acggtaccgc 
gfirtgacgcgg 

srgggcacgtc 

acgcgacgac 
cggtgagctt 
cctggcgcaa 
actgctcgat 
ggtcgttgga 
cgcggctgta 
tgagctccac 
gggtggtggc 
cgttgatatc 
tgaaaaactg 
cggcgacagt 
tctcctcttc 



cacggcggcg 
ggcgacggcg 
cgccgcccgt 
cgctaacgat 
agtccgcatc 
aaggtaggct 
aggtgctgct 
gcaccatgtc 
cgttttgaca 
cttcttctcc 
ttggccgtag 
gaagcagggc 
gggtagactg 
aagtgcagtt 
tgtacctgag 
ggtactggta 
tggccggggc 
tggacatcca 
tccagatgtt 
gcgcgcaatc 
cgtggtctgg 
tatccggccg 
acgtcagaca 
tagctttttt 
taagtggctc 
cggttcgagt 
agaccccgct 
tgcatccggt 
ggcagacatg 
ttgacgcggc 
acttggagga 
gggtgcagct 
accgcgaggg 
tgcggcatgg 
gaaccgggat 
agcagacggt 
ttgtggcgcg 



tgtccatggt 
cgcatagacg 
tggtggcggc 
gcggcgggcg 
gcgagccccc 
ggcgccgcgc 
gcggcggttg 
gagcctgaaa 
aatctcctgc 
ctcttcctcc 
aatgcgggcc 
gaccacgccc 
gtgccgggcg 

ggtgtgttct 

ccccaaggcc 
ggagttgcgc 
gtcgcgcacc 
cataagggcc 
acgacggcgc 
catggtctcg 
catgtcccgg 
gcatctcaac 
gaccggatcg 
gagcaccgtg 
gatgatgtaa 
cttgggtccg 
tcggcgcagg 
ttcctcttgt 
gtggcgccct 
taggtcggcg 
gaagtcatcc 
ggccataacg 
acgcgagtaa 
tcccaccaaa 
tccgggggcg 
ggtgatgccg 
gcgcagcggc 
gttgacgctc 
tggataaatt 
tccgccgtga 
acgggggagt 
ggccactggc 
gctccctgta 
ctcggaccgg 
tgcaaattcc 
gctgcggcag 
cagggcaccc 
agcagatggt 
gggcgagggc 
gaagcgtgat 
agaggagccc 
cctgaatcgc 
tagtcccgcg 
gaaccaggag 
cgaggaggtg 



ctggagctcc 
ggtcagggcg 
gtcgatggct 
gtgggccgcg 
ggaggtaggg 
gcgggcagga 
atctcctgaa 
gagagttcga 
acgtctcctg 
tggagatctc 
atgagctgcg 
ccttcggcat 
aagacggcgt 
gccacgaaga 
tcaaggcgct 
gccgacacgg 
tcgcgctcaa 
tccccttctt 



accgggaggc 
gtgacggcgc 
ttatgggttg 
aattgttgtg 
gaaaacctct 
gcgggcggca 
ttaaagtagg 
gcctgctgaa 
tctttgtagt 
cctgcatctc 
cttcctccca 
acaacgcgct 
atgtccacaa 
gaccagttaa 
gccctcgagt 
aagtgcggcg 
agatcttcca 
gcggcggtgg 
aaaaagtgct 
tagaccgtgc 
cgcaagggta 
tccatgcggt 
gctccttttg 
cgcgcgcagc 
gccggagggt 
ccggactgcg 
tccggaaaca 
atgcgccccc 
tcccctcctc 
gattacgaac 
ctggcgcggc 
acgcgtgagg 
gaggagatgc 
gagcggttgc 
cgcgcacacg 
attaactttc 
gctataggac 



cgcggcgtca 
cgggctagat 
tgcaagaggc 

ggggtgtcct 

ggggctccgg 
gctggtgctg 
tctggcgcct 
cagaatcaat 
agttgtcttg 
cgcgtccggc 
agaaggcgtt 
cgcgggcgcg 
agtttcgcag 
agtacataac 
ccatggcctc 
ttaactcctc 
aggctacagg 
cttcttctgg 
ggtcgacaaa 
ggccgttctc 
gcggggggct 
taggtactcc 
cgagaaaggc 
gcgggcggcg 
cggtcttgag 
tgcgcaggcg 
agtcttgcat 
ttgcatctat 
tgcgtgtgac 
cggctaatat 
agcggtggta 
cggtctggtg 
caaatacgta 
gcggctggcg 
acataaggcg 
tggaggcgcg 
ccatggtcgg 
aaaaggagag 
tcatggcgga 
taccgcccgc 
gcttccttcc 
gtaagcggtt 
tattttccaa 



gcgaacgggg 
gggacgagcc 
ctcctcagca 
ctaccgcgtc 
ccccgcggcg 
taggagcgcc 
cgtacgtgcc 
gggatcgaaa 
tgcgcgagga 
tggcggccgc 
aaaaaagctt 
tgatgcatct 



ggtcaggcgg 
ccaggtgata 
cgcatccccg 
tggatgatgc 
acccgccggg 
cgcgcgtagg 
ctgcgtgaag 
ttcggtgtcg 
ataggcgatc 
tcgctccacg 
gaggcctccc 
catgaccacc 
gcgctgaaag 
ccagcgtcgc 
gtagaagtcc 
ctccagaaga 
ggcctcttct 
cggcggtggg 
gcgctcgatc 
gcgggggcgc 
gccatgcggc 
gccgccgagg 
gtctaaccag 
gtcggggttg 
acggcggatg 
gtcggccatg 
gagcctttct 
cgctgcggcg 
cccgaagccc 
ggcctgctgc 
tgcgcccgtg 
acccggctgc 
gtcgttgcaa 
gtagaggggc 
atgatatccg 
cggaaagtcg 
gacgctctgg 
cctgtaagcg 
cgaccggggt 
gtgtcgaacc 
aggcgcggcg 
aggctggaaa 

gggttgagtc 

gtttgcctcc 
ccttttttgc 
gcggcaagag 
aggaggggcg 
ccgggcccgg 
ctctcctgag 
gcggcagaac 
gttccacgca 
ggactttgag 
cgacctggta 
taacaaccac 
gtgggacttt 



8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760 

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120 

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220 

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280 

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520 

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtctt gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 

caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700

tttcccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240

tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300

aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacttcag 18960

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500

tgggagagtc ctcaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340

ctacgactac atgaacaa'^c gagtggtggc tcccgggtta gtggactgct acattaacct 20400

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460

tgctggcctg cgctaccgct caatg^tgct gggcaatggt cgctatgtgc ccttccacat 20520

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540

gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900

atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960 

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020 

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080 

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140 

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260 

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320 

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380 

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440 

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500 

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560 

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620 

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680 

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740 

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800 

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860 

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920 

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980 

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040 

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100 

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160 

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220 

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280 

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400 

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460 

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520 

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580 

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700 

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760 

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820 

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000 

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060 

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120 

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240 

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300 

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360 

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420 

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480 

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540 

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600 

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660 

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720 

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780 

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840 

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900 

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080 

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140 

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200

cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420

tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840

aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900

ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960

aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020

actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080

gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140

gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200

cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260

gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320

aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380

gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440

ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500

gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560 

actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620 

gagcccattt atacacaaaa tggaaaacta ggactaaagC acggggctcc tttgcatgta 31680

acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740 

tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800

aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860 

tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920 

aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980 

aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040 

acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100

acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160 

gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220

aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280

aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340 

gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400 

agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460 

gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520 

gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580 

agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640 

attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700 

ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760 

actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820 

ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880 

tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940 

acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000 

catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060 

caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120 

gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180 

agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240 

ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300 

ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360 

ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420 

aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480 

agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540 

cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600 

tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660 

ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720 

ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780 

cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840 

aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900 

cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960 

agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020 

caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080 

tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140 

tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200 

tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260 

cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320 

cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380 

aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440 

aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500 

ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560 

ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620 

agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680 

ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740 
gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800 

gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860 

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920 

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980 

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040 

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100 

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160 

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220 

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280 

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340 

cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400 

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460 

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520 

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580 

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640 

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700 

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760 

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820 

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880 

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935 

<210> 10 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NSsuboptmut 

<400> 10 

gccaccatgg cccccatcac cgcctacagc cagcagacca ggggcctgct gggctgcatc 60 

atcaccagcc tgaccggacg cgacaagaac caggtggagg gagaggtgca ggtggtgagc 120 

accgctaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggagccggaa gcaagaccct ggccggaccc aagggcccta tcacccagat gtacaccaat 240 

gtggatcagg atctggtggg ctggcaggcc cctcccggag ccaggagcct gacaccctgt 300 

acctgtggaa gcagcgacct gtacctggtg acacgccacg ccgatgtgat ccccgtgagg 360 

cgcaggggcg attctcgcgg aagcctgctg agccctaggc ccgtgagcta cctgaagggc 420 

agcagcggag gacccctgct gtgtccttct ggccatgccg tgggcatttt tcgcgctgcc 480 

gtgtgtacca ggggcgtggc caaagccgtg gattttgtgc ccgtggaaag catggagacc 540 

accatgcgca gccctgtgtt caccgacaac agctctcccc ctgccgtgcc ccaatcattc 600 

caggtggctc acctgcacgc ccctaccgga tctggcaaga gcaccaaggt gcccgctgcc 660 

tacgccgctc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc taccctgggc 720 

ttcggcgctt acatgagcaa ggcccatggc atcgacccca acatccgcac aggcgtgcgc 780 

accatcacca ccggagctcc cgtgacctac agcacctacg gcaagttcct ggccgatgga 840 

ggctgcagcg gaggagccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcattgg caccgtgctg gatcaggccg aaacagctgg agccaggctg 960 

gtggtgctgg ccacagctac ccctcctggc agcgtgaccg tgccccatcc caatatcgag 1020 

gaggtggccc tgagcaacac aggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gaggcaggca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gctgccaagc tgagcggact gggcatcaac gccgtggcct actacagggg cctggacgtg 1200 

tcagtgatcc ccaccatcgg cgatgtggtg gtggtggcca ccgacgccct gatgacaggc 1260 

tacaccggag acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgaa accaccaccg tgcctcagga tgctgtgagc 1380 

aggagccaga ggcgcggacg caccggaagg ggcaggcgcg gaatttatcg ctttgtgacc 1440 

cctggcgaaa ggccctctgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgct 1500 

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ggctgcgctt ggtacgagct gacacccgct gaaaccagcg tgcgcctgcg cgcttatctg 1560 

aatacccctg gcctgcccgt gtgtcaggac cacctggagt tctgggagag cgtgttcaca 1620 

ggactgaccc acatcgacgc ccatttcctg agccagacca agcaggctgg cgacaacttc 1680 

ccctatctgg tggcctatca ggccaccgtg tgtgctaggg cccaagctcc acctccttca 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccctacccct 1800 

ctgctgtacc gcctgggagc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgctgatctg gaagtggtga ccagcacctg ggtgctggtg 1920 

ggaggcgtgc tggccgctct ggctgcctac tgcctgacca ccggaagcgt ggtgatcgtg 1980 

ggacgcatca tcctgagcgg aaggcccgct atcgtgcccg atcgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgtgccagc cacctgccct acatcgagca gggcatgcag 2100 
ctggccgaac agttcaagca gaaggccctg ggcctgctgc agacagccac caaacaggcc 2160 

gaagctgccg ctcccgtggt ggaaagcaag tggagggccc tggagacctt ctgggctaag 2220 

cacatgtgga acttcatctc tggcatccag tacctggccg gactgagcac cctgcctggc 2280 

aaccccgcta tcgccagcct gatggccttc accgctagca tcacctctcc cctgaccacc 2340 

cagagcaccc tgctgttcaa cattctgggc ggatgggtgg ccgctcagct ggcccctcct 2400 

tcagctgctt ctgcctttgt gggcgctggc attgccggag ccgctgtggg cagcattggc 2460 

ctgggcaaag tgctggtgga tattctggct ggctatggcg ctggcgtggc cggagccctg 2520 

gtggccttca aggtgatgag cggagagatg cccagcaccg aggacctggt gaacctgctg 2580 

cctgccattc tgagccctgg agccctggtg gtgggcgtgg tgtgtgctgc cattctgagg 2640 

cgccatgtgg gacccggaga gggcgctgtg cagtggatga accgcctgat cgccttcgcc 2700 

tctcgcggaa accacgtgag ccctacccac tacgtgcctg agagcgacgc cgctgccagg 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac accctgcagc ggaagctggc tgagggacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccaactg 2940 

cctggcgtgc ccttcttctc atgccagcgc ggatacaagg gcgtgtggag gggcgatggc 3000 

atcatgcaga ccacctgtcc ctgcggagcc cagatcacag gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccctaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggaccctg cacacccagc cctgctccca actacagcag ggccctgtgg 3180 

agggtggctg ccgaggagta cgtggaggtg accagggtgg gagacttcca ctacgtgacc 3240 

ggaatgacca ccgacaacgt gaagtgtccc tgtcaggtgc ccgctcccga attttttacc 3300 

gaagtggatg gcgtgcgcct gcatcgctat gcccctgcct gtaggcccct gctgcgcgaa 3360 

gaagtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cctgagcccg atgtggccgt gctgaccagc atgctgaccg accccagcca catcacagcc 3480 

gaaaccgcta aaaggcgcct ggccaggggc tctcctccaa gcctggcctc aagcagcgct 3540 

agccagctgt ctgctcccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780 

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080 

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200 

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgctc gcctgatcgt gttccccgat 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgcctcag 4740 

gtggtgatgg gctcaagcta cggcttccag tacagccctg gccagcgcgt ggagttcctg 4800 
gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac acgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccctg aggccaggca ggccatcaag agcctgaccg agcgcctgta catcggaggc 4980 

cctctgacca acagcaaggg acagaactgc ggatacaggc gctgtagggc ctctggcgtg 5040 

ctgaccacca gctgtggcaa caccctgacc tgctacctga aggccagcgc tgcctgtcgc 5100 

gctgccaagc tgcaggactg caccatgctg gtgaacgccg ctggcctggt ggtgatttgt 5160 

gaaagcgctg gcacccagga agatgctgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

aggtactctg cccctcccgg agacccccct cagcccgaat acgacctgga gctgatcacc 5280 

agctgctcaa gcaacgtgag cgtggctcac gacgccagcg gaaagcgcgt gtactacctg 5340 

acacgcgatc ccaccacccc tctggctcgc gctgcctggg aaaccgctcg ccatacaccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccta ccctgtgggc tcgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gctcaggagc agctggagaa ggccctggac 5520 

tgccagattt acggcgcttg ctacagcatc gagcccctgg acctgcccca aatcatcgag 5580 

cgcctgcacg gcctgtctgc cttcagcctg cacagctaca gccctggcga aattaatcgc 5640 

gtggccagct gtctgcgcaa actgggcgtg cctcctctgc gcgtgtggag gcatagggct 5700 

aggagcgtga gggctaggct gctgagccag ggaggcaggg ccgctacctg tggaaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ctatccctgc cgctagccag 5820 

ctggacctga gcggatggtt cgtggctggc tacagcggag gcgacatcta ccacagcctg 5880 

tctcgcgctc gccctcgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 

<210> 11 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Chimeric NSsuboptmut 

<400> 11 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60 

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120 

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240 

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300 

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360 

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420 

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480 

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540 

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600 

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660 

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720 

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780 

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840 

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960 

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020 

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200 

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260 

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380 

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440 

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500 



61/64 



WO 03/031588 



PCT/US02/32512 



ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560 

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620 

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680 

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 

gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220 

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300 

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 
gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780 

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080 

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200 

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 






gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520 

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580 

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820 

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 

<210> 12 
<211> 10 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Ribosome binding site 

<400> 12 
gccaccaugg 

<210> 13 
<211> 49 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic polyadenylation signal 
<400> 13 

aauaaaagau cuuuauuuuc auuagaucug uguguugguu uuuugugug 49 

<210> 14 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Additional nucleotides present in pV1Jns-NS 
<400> 14 

tctagagcgt ttaaaccctt aattaagg 28 
<210> 15 
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<211> 15 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Additional nucleotides present in pV1Jns-NSOPTmut 
<400> 15 

tttaaatgtt taaac 15 



<210> 16 
<211> 24 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Oligonucleotide primer 
<400> 16 

tcgaatcgat acgcgaacct acgc 24 

<210> 17 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 17 

tcgacgtgtc gacttcgaag cgcacaccaa aaacgtc 37 
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